A generative sequencer in a box

I’m still working on the name, but overall I’m very happy with how this project turned out. It was heavily inspired by the last unit on generative art, but this assignment had the additional stipulation that the artifact had to be both interactive and performable. This emphasis on performance is especially challenging, because it implies that the interface must be designed in an intuitive way, meaning that it can only have a simple set of parameters. Luckily, a generative system is well suited to the task. It can produce entirely new art with each execution, even without the benefit of external input. This output can even be constrained to stay more or less within the realm of what is aesthetically pleasing.

Final product

The object I created ended up being a smallish box that I’ve labeled TunePlayer. It can generate algorithmic or entirely random sequences of notes, which can be played in multiple ways. The algorithms that produce these notes are written in a custom, somewhat music-oriented domain-specific language (DSL), which is interpreted directly on the device. The language, while minimal, includes support for many familiar constructs, such as if statements and loops.

Feature overview

Tone output through built-in speaker (simple square wave)
MIDI output over Bluetooth Low Energy
Note-generating DSL interpreted entirely on the device
Ability to store and select from multiple sequencing algorithms
Premium cardboard enclosure

The front of the device

The right side of the device

The left side of the device

General plan

The idea I started from was to create some kind of device that would generate musical notes through an algorithmic process. The resulting notes could then either be played on the device through a speaker, or, as a stretch goal, transmitted over MIDI to a real instrument. While I could have gone ahead and hard-coded a basic set of interesting algorithms into the device, I thought this would be far too limiting. I wanted to make a device that could be used for actual performance, and that meant a greater degree of customizability was necessary. Instead of just making the algorithms heavily parameterized, I decided they should be completely replaceable. My original vision was that a user, before a performance, could connect to the device from a computer or phone and load a new “generative sequencer” program onto it. In this way, the only limitation on the creativity and variety of sequences that could be produced would be the constraints of whatever domain-specific language was used. Ultimately, I wasn’t able to get dynamic loading of new programs to work, due to the lack of easily modifiable persistent storage on the ESP32. However, new programs can still be added to the device by embedding them in the Arduino C++ file and re-flashing, so I was able to achieve this goal part of the way.

Domain specific language

I was particularly excited that I got to create a programming language for this project, as that’s a field I’m very interested in. However, I can’t completely take credit for the design, as my language draws pretty direct inspiration from a language called Forth, which has been around since the 1970s. It struck me that Forth was a good fit for the task of sequencing because it is a stack-based language. A stack is just one kind of sequence, so the language already has many tools for manipulating sequences built in. Added to this was the fact that, because of its simplicity, Forth is very easy to implement.

Forth basics

You might be wondering what exactly it means for a language to be “stack-based”, like Forth and my DSL. A stack is a basic data structure in computer science, with some simple properties. The easiest way to understand it is to look at the analogy in the name. Imagine a stack of books. You can pile multiple books together, but you have to pile them in order, one on top of the other, or they would have nothing to rest on. This means you can only add elements to the top of the stack. In the same way, when you go to remove a book from the stack, you have to take it from the top, or else risk knocking over the others above it. This makes the stack an example of what’s called a last-in first-out, or LIFO, data structure. Elements must be removed in reverse of the order in which they were added.
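The book analogy can be tried out in a few lines of Python. This is only an illustration of the LIFO idea using a plain list as the stack; it isn't code from the device, and the book names are made up.

```python
# A Python list works as a stack: append() adds to the top,
# pop() removes from the top.
stack = []

# Pile three books, one on top of the other.
stack.append("book A")
stack.append("book B")
stack.append("book C")

# Take them off again. Each pop() returns whatever is currently on top,
# so the books come back in reverse of the order they were added.
removed = [stack.pop(), stack.pop(), stack.pop()]

print(removed)  # last in, first out
```

The last book added is the first one to come off, which is exactly the property a stack-based language builds on.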

Stack-based languages take this data structure as their foundation, making even the simplest operation a manipulation of the stack. They are often characterized by a “concatenative” style, in the sense that data flows between functions in such a way that they can be chained together. This will become clearer with an example.

In Forth, every program has a main stack, which acts as a sort of scratch space for intermediate results of operations. If we wanted to add together two values, we would have to push them to the top of the stack first, and then call the “+” function, which would fetch them back, like so:

5 3 + . ⇒ 8

Five gets pushed to the stack first, followed by three. Then + “pops” the values off and adds them together, pushing the result back onto the stack. The . function simply pops one item from the top of the stack and displays it, in this case the result of the addition: 8.
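To make the evaluation order concrete, here is a minimal Python sketch of how such a program could be evaluated. It is not the interpreter that runs on the device: the function name run is hypothetical, and it only understands integer literals, + and the . word.

```python
def run(program):
    """Evaluate a whitespace-separated program.

    Returns (stack, printed): what is left on the stack, and the values
    that the "." word displayed along the way.
    """
    stack, printed = [], []
    for word in program.split():
        if word == "+":
            # Pop the two topmost values and push their sum.
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif word == ".":
            # Pop the top value and "display" it.
            printed.append(stack.pop())
        else:
            # Anything else is treated as an integer literal and pushed.
            stack.append(int(word))
    return stack, printed


stack, printed = run("5 3 + .")
print(printed)
```

Running the same program without the trailing . would leave the 8 sitting on the stack instead of displaying it.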

Making music

The music-making DSL I created is a subset of Forth, with the crucial addition that whatever is left on the stack at the end of a program’s execution will be treated as a sequence of frequencies to be played. For instance, the following toy program:

100 200 300 400

would simply place these values on the stack in order. After it runs, the tone generator looks through the stack and, finding these values, plays a tone of each frequency, one after another. This results in four notes of gradually increasing pitch. Once each note has been played, the program can be re-executed, creating a loop.
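The “run, play the stack, run again” cycle can be sketched in Python as follows. This is an assumption-laden illustration rather than the firmware: run_program and note_stream are hypothetical names, only numeric literals are handled, and the notes are returned as numbers instead of being sent to a speaker.

```python
from itertools import islice


def run_program(program):
    """Run a literals-only program and return the final stack."""
    stack = []
    for word in program.split():
        stack.append(int(word))
    return stack


def note_stream(program):
    """Yield frequencies forever by re-running the program each pass."""
    while True:
        # Walk the stack from the bottom up, so notes come out in the
        # order they were pushed.
        yield from run_program(program)


# Take the first eight notes: two full passes through the program.
first_eight = list(islice(note_stream("100 200 300 400"), 8))
print(first_eight)
```

Because the program is re-executed on every pass, a program that produces different values each time it runs would also produce a different phrase each time through the loop.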

Complex examples

While a more detailed introduction to Forth is outside the present scope, I can still show the programs that come preloaded on the device, and give an overview of how they work.

The first program is somewhat self-explanatory:

220 247 262 294 330 349 392

It creates an ascending scale starting from A3 (which has a frequency of 220 Hz). The next one is a little bit more complex, as it involves looping:

5 do 2 5 i - pow 131 * loop

This generates a sequence of ascending C notes starting with C3. The leading 5 means that there will be five notes produced. The words within the loop (between do and loop) multiply the base frequency (C3, about 131 Hz) by a power of 2 computed from the loop index. This works to create higher C notes because the frequency of a note doubles with each octave. This final example shows how randomness can be brought into the generation:

131 2 5 0 rand pow *

This program only generates one note, but because the device handles looping automatically, the program will be continually re-executed. Much like the last one, it uses the trick of multiplying by a power of 2 to pitch shift a C3 note upwards by a certain number of octaves. This number is determined by the “rand” word, which generates an integer value between 0 and 5, as specified by the preceding 5 0 (the apparent inversion is idiomatic in Forth).
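The octave arithmetic behind the last two programs can be checked with a short Python sketch. It doesn't model the DSL's loop or the exact behavior of rand: the helper names are hypothetical, the note naming uses the common convention where MIDI note 69 is A4 at 440 Hz, and I'm assuming the random range includes both 0 and 5.

```python
import math
import random

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
C3_HZ = 131  # rounded frequency of C3, as used in the programs above


def octave_up(base_hz, octaves):
    """Raise a frequency by a whole number of octaves (one doubling each)."""
    return base_hz * 2 ** octaves


def note_name(freq_hz):
    """Name of the nearest equal-tempered note (MIDI 69 = A4 = 440 Hz)."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


# Five ascending C notes, like the looping program.
ascending = [octave_up(C3_HZ, k) for k in range(5)]
ascending_names = [note_name(f) for f in ascending]


def random_c(rng=random):
    """One C note, 0 to 5 octaves above C3 (bounds assumed inclusive)."""
    return octave_up(C3_HZ, rng.randint(0, 5))


print(ascending, ascending_names)
print(random_c())
```

The same frequency-to-note mapping is also what a MIDI output would need, since MIDI transmits note numbers rather than frequencies.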

Further motivation

While these examples hopefully provide some insight into how music could theoretically be made with my language, there’s certainly nothing terribly musical about them. However, here we see the advantage of the extreme customizability provided by the device. Put into the hands of a musically inclined person who has been familiarized with the language, the creative possibilities are far wider than anything I could hard-code. What I’m imagining is that, if a more refined final product were created from this prototype, a community would be able to form around it whose members could share programs and take inspiration from each other. The drive for openness and hackability is already beginning to take hold in the synth world. It’s most visible in devices like the Korg NTS-1 and the Minilogue XD. Both contain an extensible sound engine for which users can write new oscillators and effects in low-level DSP code. This feature greatly expands the potential of these synthesizers, and a community really has sprung up around them where people can share their improvements. It seems like a powerful trend, and I imagine that the sequencer market, which primarily values the variety of sequences that can be produced, could benefit from the same openness.

Tune Player