Conditionals in Ciaramella

Posted by Stefano on Sat, 06 Aug 2022 11:03:41 GMT.

Ciaramella is not the first declarative dataflow language around. Remarkable members of this category of languages, at least to some degree, include FAUST, Lustre, Kronos, Reaktor Core, and Max's Gen.

Last time I discussed why we were unsatisfied with them and had to invent our own. Essentially, to me they don't really seem to be designed for actual development of music tech products (Lustre being innocent here, as it is openly not designed for that). Plus, I am somewhat irritated by their strong coupling between syntax and complier internals which cripples modularity (details in this paper, and I have to admit we might have mistreated Reaktor Core somewhat, but it's still practically unusable for other reasons IMO).

Another aspect in which these languages are lacking or needlessly complicated is branching/conditional execution (vulgarly, "if"s, "switch"es, etc.). There are valid technical and historical reasons for this, but I believe it's time to get past them. In this post I'll tell you what we're doing about it in Ciaramella.

Data flow and control flow

It's been a dozen years since I took my complier design courses at the university, so I might be somewhat inaccurate here, but I clearly remember a good part was about data flow and control flow analysis. Now, I think the terminology used there is somewhat misleading when used straight in the context of DSP languages, as it refers to concepts and representations that are related but not identical to what we have in our domain. As a quick example, nodes in data flow graphs for compilers are typically static operations, while nodes in synchronous data flow (SDF) graphs can contain state. You may think this is obvious but I have the impression that most works in our domain are plagued by varying levels of ambiguity and confusion regarding these topics.

If I remember correctly, in modern compiler technology control flow dominates over data flow, as in control flow graphs (CFGs) with nodes containing blocks of code in static single-assignment (SSA) form and control data flow graphs (CDFGs), which are CFGs where each node is a data flow graph (DFG). CFGs are easily built when analyzing imperative code as potential execution schedules are explicit, but that is not the case when dealing with purely declarative languages such as ours, especially since we defer scheduling to the last possible moment in order to preserve modularity. Furthermore, while data flow representations are "just" a mean to perform optimizations in conventional languages, in our case the data flow structure is the underlying model of the language itself.

The whole SDF paradigm seems to crackle when dealing with branching. Relatively recent research has tried to extend the basic model in all sorts of ways, but AFAIK there's no clear winner yet. In practice, some languages like FAUST just avoid it (the closest you get are selectors, but all "branches" are always executed). Otherwise, one might be tempted to encapsulate conditional branches and see them as a black box from the outside, which would harm modularity.

We have studied the problem and I believe we found an elegant and modular solution for our use case and we are already implementing right now.

Syntax additions

First, we decided to add anonymous blocks to the language that give limited scope/accessibility to part of a block. Essentially, you can specify something like this.

y = block(x) {
    h = 2

    # anonymous block
    y1, y2 = {
        h = 1       # this is only accessible from within the anonymous block
        y1 = h      # here h refers to the internal one
        y2 = x      # x and other external variables ares still accessible
    }

    y = y1 + y2 + h # h here is the external one, overall you get y = x + 3
}

And since we were there, we also added the possibility of defining named blocks within a block for internal use only. And everything can be nested ad libitum. All is absolutely orthogonal, so each inner block can be stateful, have its own initial conditions, etc.

In a similar fashion, we added specific syntax for if/else branching.

var1, var2, var3 = if (condition) {
    anonymous block code only executed if condition is true
} else {
    anonymous block code only executed if condition is false
}

Simple as that. And again such statements can be freely nested. Right now the condition is just an expression that is taken to be true when its result is not 0, as we still lack types etc. This will change at some point.

Such an arrangement is flexible enough to directly implement downsampling without any specific syntax. For example, here's a zero-order hold 1:2 downsampler.

y = downsampler(x) {
    y, s = if (delay1(s)) {
        y = x
        s = 0
    } else {
        y = delay1(y)
        s = 1
    }
    @s = 1
    @y = 0
}

While the output is still at the same sample rate as the input, each branch is executed only half of the time, so you could use one to do your computations and the other to just copy the previous output. You would however need to embed your code within the branch of choice, which is clearly not modular, or replicate this structure in e.g., a module in series etc., so this is probably not optimal in the general case. We will see if we find a way to express multirate processing concisely and effectively in the future. (Block pointers maybe?)

Scheduling

Since anonymous blocks and internal named blocks are, after all, just syntactic sugar, no change was needed in the scheduling algorithm to accomodate them.

On the contrary, braching requires quite some effort to get working properly. First of all, the condition expression becomes a dependency for the output variables. Then, the branches are first treated as black boxes. If no delay-free loop arises, then it's safe to schedule the whole conditional expression as a single entity. Otherwise, we're in front of a pathological case which might still be computable, hence the branches need to be "open up" and individually analyzed together with all "surrounding" code involved in the loop(s).

All of this is rather complicated and I'm not into many important details right now (after all, it's Paolo taking care of all of this), so I'm not discussing this further. When everything will be sorted out, implemented, and tested, we will document it all. I have no idea yet how important this could be in other fields where SDF is used, but we will certainly try to publish a scientific paper about it.

What next

We might add also a switch statement in the future — we need integer types first.

Other than that, I take the occasion to announce that we will be on vacation for the next two or three weeks.