JR
We talked a little bit about this idea that maybe the language itself is a conveyor of content as well as a command system. And I'm a little bit ambivalent about that idea; I don't know, what do you all think about it?
HHR
Ambivalent in what way?
JR
Well, to have this double function. I think there's something powerful about them being separate, about interweaving communicative language with functional language. There's something appealing about that to me, and I don't know why, or what it is.
HHR
Perhaps also because now we are surrounded by these devices that are made in that way: "Siri do this, Siri do that" or "Alexa buy this, Alexa buy that". What is then your statement by using that same kind of approach? That's what I would wonder... You can play with it, because it is also an antique interface in a way; it sounds new now because you have all the processing power to do the recognition, but voice command interfaces were already there in the late 80s or early 90s. You know, with phones you could voice-command your voicemail box and that kind of stuff. When you call a service line and you need to speak slowly into these shitty voice interfaces, when it would be much faster to type it. Like "Do you want to go to your billing? Say billing", that kind of stuff. That's the aesthetic context you situate yourself in if you build this kind of command-driven interface.

I would also find it more interesting if it was more about the computer trying to make sense of what you are saying, while you don't have to adjust your way of speaking. Either it gets what you want or it does something else; no matter what, it's doing something. That's the important part, that it reacts and tries to interpret what you are doing. Maybe it just tries to interpret the dynamics of your speech or your intonation. You know, different parameters that might circumvent this idea that speech ends up reduced to a very formal system of logical characters, and could maybe even retain more of these analog elements of articulation, the accidentals of speech. I would find interesting this question of how you could work with speech as an interface without going through the bottleneck of a speech-to-text processor. You reduce all that rich information to trying to extract those words, and then you match the words and do something: play the sound or adjust the volume of the sound. Something that is more ambivalent or richer, in a way.
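A minimal sketch of what is being described here, assuming nothing beyond NumPy: instead of running speech-to-text, a raw voice signal is analyzed per frame for coarse "analog" features (an energy envelope and a rough pitch proxy from zero crossings), which are then mapped continuously onto a control parameter such as playback volume. All names (control_from_voice, the frame sizes, the weighting) are hypothetical choices for illustration, not an existing system.

```python
import numpy as np

SR = 16000          # sample rate in Hz (assumed)
FRAME = 1024        # analysis frame length
HOP = 512           # hop size between frames

def frame_features(signal: np.ndarray) -> np.ndarray:
    """Return per-frame (rms_energy, zero_crossing_rate) pairs."""
    feats = []
    for start in range(0, len(signal) - FRAME, HOP):
        frame = signal[start:start + FRAME]
        rms = np.sqrt(np.mean(frame ** 2))
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
        feats.append((rms, zcr))
    return np.array(feats)

def control_from_voice(signal: np.ndarray) -> np.ndarray:
    """Map the energy envelope to a 0..1 control curve (e.g. volume),
    with the zero-crossing rate nudging it, so intonation and
    articulation leak into the control signal instead of being
    reduced to recognized words."""
    feats = frame_features(signal)
    rms, zcr = feats[:, 0], feats[:, 1]
    rms_norm = rms / (rms.max() + 1e-9)
    zcr_norm = zcr / (zcr.max() + 1e-9)
    return np.clip(0.8 * rms_norm + 0.2 * zcr_norm, 0.0, 1.0)

# Example with a synthetic "voice-like" test signal: a slowly modulated tone.
t = np.linspace(0, 2.0, int(2.0 * SR), endpoint=False)
test_voice = np.sin(2 * np.pi * 180 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 1.5 * t))
volume_curve = control_from_voice(test_voice)
print(volume_curve.shape, volume_curve.min(), volume_curve.max())
```

However crude, the point of the sketch is that the mapping never passes through a word lattice: the system always "does something" in response to the dynamics of the voice, whether or not it understood anything.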
JR
Yeah, I mean, maybe coming back to machine learning, that's an interesting use case for a learning system too: training it on the temporal characteristics of the voice and having it be a little bit messy, intentionally. And using this as control parameters, maybe not for controlling a synthesis process but for plugging things together, more in a live coding way of thinking. Like initiating processes or interconnecting processes a little bit more modularly. So then it also becomes kind of like a coding language in a sense. It becomes a command language, but one that's not so differentiated in its commands, one where the commands are not so discrete but a little bit more mobile, I guess. Yeah, that could be interesting, and an interesting connection to the whole discussion about memory and machine learning.
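A rough sketch, under assumptions not spelled out in the conversation, of what "commands that are not so discrete" could look like: short windows of temporal voice features are clustered with scikit-learn's KMeans, and instead of picking one command per window, the distances to all cluster centres are turned into soft weights that crossfade between several process connections. Everything here (ROUTES, soft_routing, the synthetic feature windows) is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Pretend each row is a window of temporal voice features
# (e.g. energy, pitch proxy, duration statistics) gathered while speaking.
rng = np.random.default_rng(0)
feature_windows = rng.normal(size=(200, 6))

# Each cluster corresponds to one way of patching processes together.
ROUTES = ["mic->granulator", "granulator->delay", "delay->output", "mic->output"]

kmeans = KMeans(n_clusters=len(ROUTES), n_init=10, random_state=0)
kmeans.fit(feature_windows)

def soft_routing(window: np.ndarray) -> dict:
    """Return a weight per route instead of a single discrete command."""
    distances = kmeans.transform(window.reshape(1, -1))[0]
    weights = np.exp(-distances)          # closer cluster -> larger weight
    weights /= weights.sum()
    return dict(zip(ROUTES, weights.round(3)))

# A new utterance window nudges the patch rather than switching it outright.
print(soft_routing(rng.normal(size=6)))
```

The soft weights are one possible reading of "mobile" commands: an utterance leans the patch toward a configuration rather than selecting it outright, and the learned clusters can stay intentionally messy.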