The Rubber Duck That Talked Back

Or: How Conversational AI Assistant Systems are revolutionizing knowledge work around the globe by becoming actually useful to individuals.

In software engineering there is a practice called “rubberducking”. Or more precisely: rubber duck debugging. Look it up, it is real. The expression comes from The Pragmatic Programmer, a hugely transformative book written at the end of the last millennium. Rubberducking is a very simple technique that has been profoundly useful to enough people that it stuck around. It goes like this: when you have to reason through an intricate problem, explaining it to a rubber duck that sits on top of your monitor will help you figure it out. That’s about it. In its original form, rubberducking was specifically about finding hard-to-spot bugs in a large body of source code – but you can use it for any kind of problem. Nor does it need a yellow rubber duck sitting on a monitor. You can use any toy or other inanimate object you prefer. It will work. It will work brilliantly.

Why does it work?

Well, for a couple of reasons. First, by speaking your thoughts out loud you are forced to transform them from your inner thinking into externalized structures: words, phrases, even full sentences. You are thinking aloud. Doing so inevitably surfaces all the silent assumptions and subconscious conclusions you made – and the gaps therein. You will easily recognize those gaps, for they are hard to put into words. Once those gaps are known, you can of course close them by acquiring the missing knowledge. This in turn will increase your comprehension of the topic. In short: the rubber duck helped.

Another reason is that through speaking aloud you hear your own words back. Sounds trivial. And it is, but it is also powerful. “It sounded better in my head” is a common saying for good reason. Alas, I can personally attest that most things that sound great in my head somehow become incoherent noise once they leave my mouth on a gust of air. I am sure you have observed this in yourself and others as well. It is a rather common thing. A human thing. Hence, the rubber duck helps again (not only the speaker, but also the audience) by being the silent recipient of those bumbling first attempts to formulate a thought.

When doesn’t it work?

The rubber duck is very helpful, but it is no magic bullet. While it assists with applying reason to find a solution, it still requires you to do the legwork – so to speak – of gathering all the arguments and acquiring all the information that underpins them. In a word: research.

Research is the process of expanding your knowledge. There are endless ways to approach it. My own “standard algorithm” – the one I apply when writing more involved blog posts – starts with finding and reading popular articles that target a non-expert audience in the domain I am interested in. This usually teaches me the right words to find broader “meta-studies”: writings that summarize the findings of a discipline. This step makes me dig into the methodologies and the language the field uses to communicate – at least to some depth. Armed with this knowledge, I can find, read, and understand more specialized papers and articles. This is usually sufficient to answer the questions that come up while writing.

If you think this sounds like a lot of effort, let me assure you that you are correct. It is.

In my day job my research approach is similar, although there I am usually targeting solutions to specific problems and am normally already deep into the topic I am investigating. You are probably doing something comparable – at least if you do any kind of knowledge work.

None of that is new. It comes with the territory of working with information. It takes a lot of time.

Enter: Conversational AI Assistant Systems

For about a year now, I have had help with my research. I hired an assistant. Well, not quite. I started using Conversational AI Assistant Systems (CAAS) like ChatGPT, Bard, and Claude. A lot. Concurrently. It is impossible for me to convey how much easier my research – and subsequently my writing – has become. It made the process at least two orders of magnitude less painful – maybe more. It, quite literally, got me thinking. In this case, it got me thinking: why does this work so well? I mean, I had access to the same information before, via normal web search. Googling. Why does this new interface support my research so much better?

From what I can tell there are multiple answers. Let’s start with the obvious one: CAAS are built on large language models (LLMs). They are trained on mind-bogglingly huge datasets and can aggregate the interconnected information into coherent answers. This is a significant improvement over googling one thing, which just gives you two more things to look for, each of which gives you two more things, and so forth. Keeping track of all these different threads of thought is not easy. Human error is likely. The risk of finding yourself in a deep rabbit hole keeps mounting. Not so with a CAAS, which gives you a combined result in one understandable chunk. This compares best to the meta-studies I mentioned earlier – only these are made on the fly for whatever question I have.

Now I can get to why I started with the story about rubber ducks above: a CAAS helps you massively in formulating your thoughts. It turns out that it does not matter whether you speak your thoughts out loud or write them down. As long as you do so with the intention that they can be understood by someone else, the same corrective effect occurs. With one slight difference: this rubber duck talks back! And that is not a trivial difference. At the very least, your questions will be reflected back to you through the answers the LLM returns: if your formulation was ambiguous, you will find that so is the reply. If the answer seems wrong or beside the point, chances are your question was not what you wanted to ask. In short: a CAAS gives you a continuous feedback loop, like a conversation with a (human) expert would (and yes: you can actually have a spoken conversation instead of a typed chat, if you prefer). Moreover, this feedback loop guides you toward the answer in a way that feels very familiar from another known good practice: root cause analysis (or “20 questions”).
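
To make that feedback loop concrete, here is a minimal sketch in Python. The call_llm function is a hypothetical placeholder for whichever provider you use (ChatGPT, Claude, ...); the important part is that the entire message history is resent on every turn, which is what lets each answer sharpen your next question.

```python
# Minimal sketch of the CAAS feedback loop described above.
# call_llm() is a hypothetical stand-in for a real provider API.

def call_llm(messages: list[dict]) -> str:
    """Hypothetical: send the conversation so far, receive the next reply."""
    raise NotImplementedError("wire up your provider of choice here")

def research_session() -> None:
    messages: list[dict] = []  # the growing history is what enables iteration
    while True:
        question = input("you> ").strip()
        if not question:
            break  # an empty line ends the session
        messages.append({"role": "user", "content": question})
        answer = call_llm(messages)
        messages.append({"role": "assistant", "content": answer})
        print(f"caas> {answer}")

if __name__ == "__main__":
    research_session()
```

Note that an ambiguous question simply produces an ambiguous answer in the transcript – the loop itself is the corrective mechanism.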

Lastly, the questions you ask a CAAS are rarely single words or short sentences. You can formulate immensely more complex and nuanced queries, of multi-paragraph length. And you will. The amount of information and conditions such questions can contain is vastly greater than what you can type into a search engine input field like Google’s or Bing’s (until those become a direct interface to a CAAS as well, of course). You can also iterate, which allows you to keep spinning your thoughts almost without friction. A CAAS allows your thinking to flow unhindered.

All good, then?

As with any good thing: there are caveats. For CAAS, or rather the LLMs that underpin them, you will find serious concerns enumerated everywhere, so I will make this short:

- Hallucinations: LLMs will confidently produce plausible-sounding but wrong answers.
- Bias: whatever biases live in the training data also live in the output.
- Privacy: your prompts are processed, and possibly stored, by the provider.
- Intellectual property: the legality of training on (and reproducing) copyrighted material is unresolved.
- Over-reliance: outsourcing our thinking wholesale risks impairing our ability to think creatively.

This is certainly not an exhaustive list, but it should paint a picture of the scope of the problem. My stance is simple: Use with care. You can harm yourself (or others) with any tool. A CAAS is just that: a very complex and very sophisticated tool. Personally, my biggest concern is the last item on the list: impairing our ability to think creatively. The other items are rather technical problems that will find a solution, given sufficient time and resources. But this last item has indeed the (slight) potential to cause significant harm in the long term.

However, where there are serious risks, there are also equally substantial opportunities, and I am very excited to see what comes next.

What’s next?

I think that CAAS will become ubiquitous, because it is the next generation of interface, and it will enable a whole new generation of massively more powerful (and more complex) tools that will make life and work easier. Here comes my wishlist:

“In 1986, if you added up all the information being blasted at the average human being - TV, radio, reading - it amounted to 40 newspapers-worth of information every day. By 2007, they found it had risen to the equivalent of 174 newspapers per day.” – Johann Hari, Stolen Focus, 2022

Living in the Information Age is fraught with the problem of every individual being exposed to an ever-increasing bandwidth of information. CAAS is the first tool that gives me hope that we will be able to mitigate or even reverse this destructive trend. Not by decreasing the amount of information that hits you, but by increasing the amount of information you can handle. Imagine a fully integrated CAAS that has access to your emails, Slack messages, and any other communication channels you use. And your documents. And your pictures. And everything else. Now imagine your CAAS automatically summarizes and prioritizes all communication streams for you – at the same level and with the same quality as a highly trained, full-time human assistant today. Basically: CAAS has the potential to give you the signal while keeping away the noise. It could do even more: replying on your behalf and in your interest to (some of the easier) communications. Or at least gathering all requested information and preparing replies or reply variants for your review. You get the gist. All of this would be possible today; however, as far as I know, it is not (yet) offered by any provider. I would call this a fully integrated, personal CAAS.
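
Nothing about this requires new inventions – the plumbing is conceptually simple. Here is a rough sketch of such a triage step, where fetch_messages and call_llm are hypothetical placeholders for the channel connectors and the model:

```python
# Sketch of the "signal vs. noise" triage step described above.
# fetch_messages() and call_llm() are hypothetical placeholders;
# real connectors (mail, Slack, ...) and a real model would slot in here.

from dataclasses import dataclass

@dataclass
class Message:
    channel: str  # e.g. "email" or "slack"
    sender: str
    body: str

def fetch_messages() -> list[Message]:
    """Hypothetical: pull new items from every connected channel."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Hypothetical: a single LLM call, any provider."""
    raise NotImplementedError

def daily_briefing() -> str:
    inbox = fetch_messages()
    digest = "\n".join(f"[{m.channel}] {m.sender}: {m.body}" for m in inbox)
    prompt = (
        "Summarize the following messages, ordered by urgency. "
        "For routine requests, draft a short reply for my review.\n\n"
        + digest
    )
    return call_llm(prompt)
```

The hard part is not the code – it is getting, and trusting anyone with, access to all those channels.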

This immediately brings up another item: Trust. Currently, all CAAS are provided and maintained by private companies. You do not know what will happen to the data you store with them. Maybe they are all great people with good morals – but who knows who buys them tomorrow? Maybe they have great intentions – but their security posture is far less comprehensive than they think, and they lose all your highly private communication to some internet pirate from overseas. This is just a small excerpt of the possible horror scenarios. Independent of this, there is also the question of data accessibility: say only your documents are with Google, but your company messaging is with Slack and your mails are hosted by a small local outfit. Google’s CAAS will likely not be able to access all of that. So to bring about this shiny future where a CAAS can double as a full personal assistant, there needs to be a different approach – one that assuages privacy concerns and is not tightly coupled to a single data-sphere. This could be solved in multiple ways. One would be to take the CAAS offline. This is entirely possible. A lot of very big LLMs are already available for download. Big players like Apple are actively working to bring LLM execution to mobile devices. This can be done! We need trusted, private, vendor-independent CAAS.
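
To show that offline operation is not science fiction, here is a tiny proof of concept using the Hugging Face transformers library (one popular way to run downloadable models; the choice of library and the toy gpt2 model are just illustrative assumptions). The weights are fetched once, after which inference runs on your own machine:

```python
# Proof of concept: an LLM running locally, no provider in the loop.
# Requires: pip install transformers torch
# gpt2 is tiny and dated - bigger downloadable models follow the same pattern.

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Rubber duck debugging works because", max_new_tokens=40)
print(result[0]["generated_text"])
```

Apart from the initial model download, the inference itself never leaves your machine.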

Lastly, if I have my way, all of the above will be carried on the wings of an agreed-upon CAAS standard. Right now, as a user, you need to come up with creative workarounds – like the Word document my wife uses to store an “initial conversation” with which she “seeds” the CAAS with context. This is a hack. It is a hack that works, but it is a hack nonetheless. What I want is a transportable behavior profile that contains all the information about how you want the CAAS to behave, what it should know, and what parts of your data it should have access to. Maybe there could even be transportable discussion profiles that capture the state of an investigation, so that it can easily be continued later with a different CAAS. All of that should of course be vendor-independent, so that I can use different CAAS providers – online or offline. In short: I want the execution (LLM) to be separate from the data (profile), and the data to be transportable.
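
To illustrate – and this is pure speculation, since no such standard exists today – such a profile could be as simple as a structured document:

```python
# A speculative sketch of a transportable behavior profile.
# Every field here is made up; the point is that the profile is plain
# data, separate from the model, and can move between providers.

import json

behavior_profile = {
    "version": "0.1",
    "persona": {
        "tone": "concise, technical",
        "language": "en",
    },
    "seed_context": [
        "I write long-form blog posts about software engineering.",
        "Prefer primary sources and state your uncertainty.",
    ],
    "data_access": {
        "email": False,                    # which of my data it may read
        "documents": ["~/blog/drafts/"],
    },
}

print(json.dumps(behavior_profile, indent=2))  # serialize, carry anywhere
```

Serialized to a file, this could be handed to any compliant provider – which is exactly the separation of execution and data described above.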

Ok, that is it for now. The future remains exciting. Thanks for reading.

Disclaimer: I am not professionally involved in developing AI, ML, LLMs – or anything related to that. I am writing as a user of such technologies.