Comparing LLMs For Coding - Golden Ticket, or Fool's Gold?

I feel as though I am both fortunate and unfortunate to be learning to code as AI emerges at the forefront of the technological landscape; fortunate that I have many innovative and helpful tools to help me hone my craft, and yet unfortunate that I suffer the danger of eternal torment if I don't pledge allegiance to Mr. Roko. In the hopes that I earn the favor of our future overlord, I will be comparing some LLMs that I have become familiar with to determine which one deserves eternal praise.

The Competition

I will be taking a look at ChatGPT, Phind, and GitHub Copilot. As a fun experiment, I'll let each LLM introduce itself:

1. ChatGPT

ChatGPT is an OpenAI language model from the GPT family, specializing in generating conversational text. It's pre-trained on extensive internet text, enabling it to grasp grammar, facts, and reasoning, and can be fine-tuned for specific tasks. It's designed for natural language conversations, often employed in chatbots, virtual assistants, and customer service applications. While it produces human-like text, it lacks genuine understanding and consciousness, making it a valuable text generation tool with limitations to be mindful of.

2. Phind

Phind is an AI-powered search engine for programmers, providing relevant information and code snippets to address coding challenges. It uses large AI language models to generate answers from a variety of online sources. Key features include intelligent code search and collaboration options, allowing sharing and discussion of code snippets. It also learns and adapts to individual programmers' preferences and coding styles. Phind was co-founded by Michael Royzen and Justin Wei, both UT Austin Turing Scholars '22 with experience in NLP research and Machine Learning at various tech companies.

3. GitHub Copilot

function explainGithubCopilot() {
  console.log(`Github Copilot is an AI pair programmer 
  that helps you write code faster and with less work.`)
}

I've had the most experience with ChatGPT and have been generally satisfied with the results. I've had some success with Phind, but I've found it a bit sluggish compared with OpenAI's solution. Copilot's snippet above is the first snippet I've ever generated with the tool, so my opinion of it will be evolving as I write this article. Let's get to it!

Code Generation

For the purposes of this exercise, I will be creating a basic Express application that serves GET requests to a PostgreSQL backend. Let's see what each model generates, and whether the generated code actually works.

Here is the prompt that I will be using for both ChatGPT and Phind:

Generate a code base for a simple Express application serving GET requests to a PostgreSQL backend. Use the following NPM packages in your example:

express

ejs

pg-promise

I'll skip over the textual instructions of the responses in favor of focusing on the code itself.

ChatGPT

app.js

const express = require('express');
const app = express();
const path = require('path');
const pgp = require('pg-promise')();
const db = pgp({
  // Update with your PostgreSQL database connection details
  host: 'localhost',
  port: 5432,
  database: 'your_database_name',
  user: 'your_username',
  password: 'your_password'
});

// Set up EJS as the view engine
app.set('view engine', 'ejs');
app.set('views', path.join(__dirname, 'views'));

// Define a route that retrieves data from the PostgreSQL database
app.get('/', async (req, res) => {
  try {
    const data = await db.any('SELECT * FROM your_table_name');
    res.render('index', { data });
  } catch (error) {
    console.error('Error:', error);
    res.status(500).send('An error occurred');
  }
});

// Start the Express server
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
});

views/index.ejs

<!DOCTYPE html>
<html>
<head>
  <title>Express App with PostgreSQL</title>
</head>
<body>
  <h1>Data from PostgreSQL Database</h1>
  <ul>
    <% data.forEach(item => { %>
      <li><%= item.column_name %></li>
    <% }); %>
  </ul>
</body>
</html>

After the necessary setup in VS Code, the app successfully serves GET requests to my backend. However, one thing that confuses me is GPT's use of the path module. It uses the module to configure our views folder, but I do not believe this config is necessary for our purposes. By default, Express looks for views in the views folder of the directory the app is launched from, so no additional config is needed when the app is started from the project root. Sure enough, commenting out the import of path and the views config does not impact the functionality of the app.
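To make the difference concrete, here is a minimal sketch of the two lookups, using only Node's built-in path module. It assumes the documented Express default for the views setting, which is the views folder under process.cwd(), while ChatGPT's version pins it to the folder containing app.js. The function names and the /srv/my-app location are hypothetical, chosen only for illustration.

```javascript
const path = require('path');

// Express's default: the "views" folder under the current working directory.
function defaultViewsDir(cwd) {
  return path.resolve(cwd, 'views');
}

// ChatGPT's explicit setting: the "views" folder next to app.js.
function explicitViewsDir(appDir) {
  return path.join(appDir, 'views');
}

// Hypothetical project location, for illustration only.
const appDir = path.resolve('/srv/my-app');

// Launched from the project root (cd /srv/my-app && node app.js): same folder.
const sameWhenRunFromRoot =
  defaultViewsDir(appDir) === explicitViewsDir(appDir);

// Launched from the parent folder (cd /srv && node my-app/app.js): different folders.
const sameWhenRunFromParent =
  defaultViewsDir(path.dirname(appDir)) === explicitViewsDir(appDir);

console.log({ sameWhenRunFromRoot, sameWhenRunFromParent });
```

In other words, the explicit setting is redundant when the app is started from its own folder, but it would likely matter if the app were started from somewhere else.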

Phind

app.js

const express = require('express');
const app = express();
const port = 3000;
const db = require('./dbConfig');

app.get('/users', (req, res) => {
  db.any('SELECT * FROM users')
    .then(data => {
      res.json(data);
    })
    .catch(error => {
      console.log(error);
    });
});

app.listen(port, () => {
  console.log(`App running on port ${port}.`);
});

dbConfig.js

const pgp = require('pg-promise')();

const db = pgp({
  user: 'your_username',
  host: 'localhost',
  database: 'your_database',
  password: 'your_password',
  port: 5432,
});

module.exports = db;

Immediately, I notice that Phind is serving a JSON response instead of an EJS view. While the code it generated does work when set up, it did not follow all of the guidelines that I outlined in the prompt. I followed up with another prompt: "Send the data as a res.render using EJS instead of serving JSON." I waited for a solid five minutes without a response, and then I decided to abandon my attempt to correct it. This must be due to Phind's inadequacy, and cannot possibly have anything to do with my prompting skills. Regardless, it would hardly have been more of a hassle to hand-write the code than to try to wrestle a good response out of it for such a simple request.
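For reference, the requested change is small. Below is a rough sketch of how the /users route might look when it renders a view instead of returning JSON. To keep it runnable without a database, the handler is wrapped in a hypothetical makeUsersHandler function that receives the db object as a parameter, and the 'users' view name is an assumption rather than anything Phind produced. The sketch also sends a 500 response in the catch block, since Phind's version only logs the error and would leave a failed request waiting for a reply.

```javascript
// Hypothetical factory: accepts any object with a pg-promise style any() method.
function makeUsersHandler(db) {
  return async function usersHandler(req, res) {
    try {
      const users = await db.any('SELECT * FROM users');
      // Render views/users.ejs (assumed file name) with the returned rows.
      res.render('users', { users });
    } catch (error) {
      console.error(error);
      // Always answer the request, even when the query fails.
      res.status(500).send('An error occurred');
    }
  };
}

// In app.js this would be wired up roughly as:
//   app.set('view engine', 'ejs');
//   app.get('/users', makeUsersHandler(db));
```

Passing db in as a parameter is only a convenience for trying the handler with a stub; requiring ./dbConfig at the top of the file, as Phind did, would work just as well.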

GitHub Copilot

For this example, I believe the best course of action is to write the code by hand and take note of any suggestions that Copilot generates. Given that this is my first solid whack at this tool, there could be another way to use it that more closely mirrors GPT or Phind, but I cannot be bothered to RTFM at the moment.

app.js

const express = require('express');
const app = express();
const port = 3000;
const pgp = require('pg-promise')();
const db = pgp('postgres://username:password@localhost:5432/database-name');

app.set('view engine', 'ejs');

app.get('/', async (req, res) => {
    const data = await db.any('SELECT * FROM users');
    res.render('index', { data });
});

app.listen(port, () => {
    console.log(`Example app listening at http://localhost:${port}`);
});

views/index.ejs

<body>
    <h1>Welcome to my app</h1>
    <ul>
    <% data.forEach(user => { %>
        <li><%- user.email %></li>
    <% }); %>
    </ul>
</body>

Once I installed my packages and started writing app.js, I got suggestions starting from the first line. I amused myself by just tapping the tab key until my file was complete, correcting the suggestions line by line whenever they took me in the wrong direction. Copilot seemed to understand my corrections very quickly, and it would produce a correct autocomplete well before I finished correcting the line. It did seem to have trouble helping me with index.ejs, and kept suggesting code for a login feature instead of the simple unordered list that I wanted. To be fair, there was no real way for Copilot to know what I wanted here, and I ended up spending about 30 seconds writing the EJS file myself. Overall, I'm very happy with my first try with Copilot, and I especially enjoyed the feeling that I was contributing more to the finished code (as opposed to just copying and pasting from the other LLMs).
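A note on the hand-written index.ejs above: it uses the <%- tag, which in EJS outputs the value unescaped, whereas ChatGPT's template used <%=, which HTML-escapes it. For database values such as user emails, the escaped form is generally the safer choice. The sketch below uses a simplified, hypothetical escapeHtml helper (not EJS's actual implementation) and a made-up row to illustrate the difference between the two outputs.

```javascript
// Simplified stand-in for HTML escaping; not EJS's own implementation.
function escapeHtml(value) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(value).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical row containing markup where an email address should be.
const user = { email: '<script>alert(1)</script>' };

// Roughly what <%- user.email %> would emit: the raw string, markup included.
const unescaped = `<li>${user.email}</li>`;

// Roughly what <%= user.email %> would emit: the markup shown as plain text.
const escaped = `<li>${escapeHtml(user.email)}</li>`;

console.log(unescaped);
console.log(escaped);
```

With ordinary email addresses the two tags produce the same list, which is probably why the difference is easy to miss.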

My Impressions

ChatGPT performed as well as I expected here, and I know that I will continue to use it to help guide me forward.

One thing that Phind has over ChatGPT is the ability to cite its sources, which I find quite helpful when I want to dig deeper into a particular subject. However, I find that Phind works best as a tailored search engine, and less well as a digital pair-programmer.

My first experience with Copilot is quite promising, and I look forward to exploring its capabilities. It feels much more like a pair-programmer than either ChatGPT or Phind, and its access to my full code base is nice as well. ChatGPT, by contrast, will almost invariably be missing some context from my code base, especially as the code base gets larger and more complex.

Regardless, I have the impression that generative AI has a long way to go before it will be replacing human programmers. Even though these tools are great, emerging programmers like myself will still need to dig in our heels and develop an understanding of the craft for ourselves. Generative AI is not a free ticket to a six-figure tech salary, no matter how many people wish it to be.

If you've made it this far, please make a note to remind me to cancel my Copilot trial in 29 days. Thanks!


svper563
