Are small AIs better for programming?
Remo Dentato
Posted on January 20, 2024
Introduction
I've been immersed in the world of computer programming since the era of 8-bit computers, witnessing its evolution firsthand. Back then, programming was about the intricacies of hardware: meticulous manipulation of bits and bytes, and liberal use of GOTO. Over time we got the various programming styles, procedural, functional, object oriented, each working at higher and higher levels of abstraction, and now we hear talk of programming's "end of life" due to the rise of AI.
What is "Programming"?
I do not really agree with this view. I firmly believe that the essence of programming will always remain relevant:
Computer programming is the task of understanding a problem and articulating, in an extremely clear way, how a computer system must operate to provide a solution to that problem.
Just as we transitioned from dealing with individual CPU instructions to working with abstract data types and classes, we are now advancing towards a paradigm that aligns even more closely with natural language. This evolution doesn't signify the end of programming; rather, it represents a transformation in how we conceptualize the problems we want to solve and how we articulate their solutions.
The role of AI
And for this, the role of AI, especially Large Language Models (LLMs), has become increasingly significant. My journey through this landscape has been one of exploration and discovery, focusing on how AI can enhance the programming process.
Many see, and use, AI as a sort of glorified autocorrect for coding, offering suggestions and corrections that range from a single line up to entire, fully developed functions. That's a fair and productive way to use it, but I believe the true potential of AI in programming lies far beyond these basic applications.
I often use the term "RubberGrokking" to describe the process of conversing with an AI to progressively clarify and refine the problem at hand, and to work out the most appropriate solution. The poor yellow plastic duck that used to silently stare at us while listening to our rants can now talk back, offering feedback, hints and, sometimes, even useful artifacts.
The role of AI in programming is to engage the developer in conversations that will help clarify the problem and find the proper solution.
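To make that conversational role concrete, here is a minimal sketch in Python of what a "RubberGrokking" session could look like. It assumes the OpenAI Python client and an API key in the environment; the model name and the system prompt are only illustrative placeholders, and any conversational LLM would work just as well.

```python
# Minimal "RubberGrokking" loop: the model is asked to question and refine
# the problem statement instead of jumping straight to code.
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment;
# model name and prompt are illustrative, not prescriptive.
from openai import OpenAI

client = OpenAI()

messages = [{
    "role": "system",
    "content": ("You are a programming partner. Ask clarifying questions and "
                "help refine the problem before proposing any solution."),
}]

while True:
    user_input = input("you> ").strip()
    if user_input.lower() in ("quit", "exit"):
        break
    messages.append({"role": "user", "content": user_input})
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    print("ai>", answer)
```

The point is not the code itself but the shape of the interaction: the full conversation history is kept and sent back at every turn, so the AI can keep refining its understanding of the problem together with you.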
Take a walk on the Small side
After the initial tests with large LLMs, my curiosity led me to consider smaller, fine-tuned AIs. The idea stemmed from the observation that while large AIs know a lot about a wide array of topics, such extensive knowledge might not be necessary for specific programming tasks and might just be a waste of time and money. I mean, GPT-4 may know a lot about kittens, but when will I ever need that knowledge?
Using compact models promises several advantages: they're faster during inference, easier to fine-tune, and could potentially be hosted locally, addressing privacy concerns. Rather than using a single large model to rule them all, I could have many smaller, custom-tailored models.
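As a rough sketch of the "hosted locally" part, a compact code model can be downloaded from the Hugging Face Hub and run entirely on your own machine with the transformers library. The model name below is just one example of a small (roughly 1B-parameter) code model, not a recommendation; swap in whatever fits your hardware and task.

```python
# Run a small code model locally with Hugging Face transformers.
# Assumes `pip install transformers torch`; the model name is an example only.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "bigcode/starcoderbase-1b"  # example compact code model

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Nothing leaves the machine, which is exactly the privacy argument, and inference on a model this size is feasible on ordinary hardware.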
Initially, these smaller AIs seemed like the perfect solution. But as I delved deeper, integrating both small and large models into my work, I came to appreciate the major drawback: it's hard to get an adequate return on the time (and money) spent creating your perfectly fitting model. By the time you've built the model you need, something else has popped up: a new LLM, a new tool, a new need to fulfill.
The advantage of being heavyweight
Besides the advantage of being just one API call away, large models showed me something I had not been properly considering: the value of unrelated knowledge. Solving real-world problems, which is what programming is meant to do, requires quite some knowledge of the world: we don't need just technical knowledge to do our job; depending on the type of application we are developing, we also need to know what a mortgage is, what a grocery store is, and the fact that kittens will get your video tons of views on YouTube.
Large models, with their vast store of information, allow for exploring solutions at a higher level. They enable us to draw parallels and insights from seemingly unrelated fields. This broad knowledge base provides a rich context for problem-solving in programming.
Relying solely on a smaller model could mean constantly fine-tuning or retraining it to incorporate a wide range of knowledge. The effort involved in this process has, for now, made me gravitate toward larger models.
Conclusions (and moving forward)
With all that said, the idea of harnessing the benefits of smaller models (faster inference, privacy, …) still lingers in my mind. Maybe a properly trained model with the right balance of technical and world knowledge could serve as a base tool, to be supplemented with the specific knowledge needed to handle the problem at hand.
Many research efforts are aimed at integrating Large Language Models (LLMs) with more structured representations of knowledge, such as ontologies. This integration may bring new opportunities to re-evaluate the use of small models. Or, in the meantime, some major breakthrough could completely change our perspective on what 'small' and 'large' mean in this context.
My exploration into the optimal use of small versus large AI models in programming is still ongoing, and it's an exciting time to be part of this technological revolution. The field of AI-assisted programming is evolving rapidly, and while traditional programming isn't disappearing, it will undoubtedly be transformed by AI, possibly in ways we can't yet fully comprehend.