Is GPT-4 worth it?
A Comparison of GPT-3.5 and GPT-4

I recently told my friends that I bought a subscription to ChatGPT Plus, which gives access to OpenAI's premium transformer model, GPT-4, as well as access to GPT-3.5 at a significantly faster rate.

I like to use ChatGPT for coding, since I find it very helpful for generating code when I'm unfamiliar with a library or its documentation. I decided to write this article to compare GPT-3.5 and GPT-4 by asking each model to solve various real-world problems, and to figure out whether sinking $20/month into the subscription is worth it in the long run. I also tried to keep the prompts realistic by simulating what a programmer would actually ask.

Prompt 1: Decompress Data

Prompt: how to decompress a bytes of data with zip format into a folder python

GPT-3.5

import zipfile
def decompress_zip(data, folder_path):
    # Create a ZipFile object with the byte stream
    with zipfile.ZipFile(data) as zip_ref:
        # Extract all files and directories to the specified folder path
        zip_ref.extractall(folder_path)

# Example usage:
# Assuming you have the byte stream stored in a variable called 'compressed_data'
# and you want to extract the contents into a folder named 'output_folder'

decompress_zip(compressed_data, 'output_folder')

GPT-4

import io
import zipfile
import os

# Suppose `data` is your bytes object representing a zip file
data = b"..."

# Convert bytes data to a file-like object
data_io = io.BytesIO(data)

# Open the file-like object as a zip file
with zipfile.ZipFile(data_io, 'r') as zip_ref:
    # Specify the directory to which the files will be extracted
    destination_directory = "/path/to/your/folder"
    
    # Check if the directory exists, if not create it
    if not os.path.exists(destination_directory):
        os.makedirs(destination_directory)
    
    # Extract all files into the destination directory
    zip_ref.extractall(destination_directory)

Analysis

GPT-3.5's response is almost correct, but not quite: it forgot to convert the bytes into a file-like object before passing them to ZipFile (the io.BytesIO call in GPT-4's snippet). GPT-4's output is correct.

GPT-3.5: incorrect

GPT-4: correct
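For the record, GPT-3.5's function only needs a one-line fix: wrap the bytes in io.BytesIO before handing them to ZipFile. A minimal sketch of the corrected version (the example archive and file names are my own):

```python
import io
import zipfile

def decompress_zip(data: bytes, folder_path: str) -> None:
    # Wrap the raw bytes in a file-like object before passing them to ZipFile
    with zipfile.ZipFile(io.BytesIO(data)) as zip_ref:
        zip_ref.extractall(folder_path)

# Build a small zip archive in memory to demonstrate the round trip
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hello world")

decompress_zip(buf.getvalue(), "output_folder")
```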

Prompt 2: FastAPI Method

Prompt: write a fastapi method that will parse a request body of json format into a dictionary. The request body has images as binary data. Add appropriate type annotations

GPT-3.5

Sure! Here’s an example of a FastAPI method that can parse a JSON request body containing images as binary data and convert it into a dictionary:

from fastapi import FastAPI, UploadFile

app = FastAPI()

@app.post("/process_images")
async def process_images(images: list[UploadFile] = []) -> dict:
    image_data = {}

    for index, image in enumerate(images):
        image_data[f"image_{index+1}"] = image.file.read()

    return image_data

In this example, the /process_images endpoint is configured to accept a POST request. The images parameter is annotated as a list of UploadFile objects, which represents the uploaded files in the request body.

The method reads the binary data from each image using image.file.read() and stores it in a dictionary called image_data. Each image is given a key in the dictionary, such as image_1, image_2, and so on.

Finally, the method returns the image_data dictionary as the response.

Make sure to install the necessary dependencies by running pip install fastapi uvicorn[standard] to run the FastAPI server, and pip install python-multipart to handle file uploads.

GPT-4

Sure, here is a FastAPI method which parses a request body in JSON format into a Python dictionary. The JSON includes images as binary data (Base64 encoded) as it is not feasible to send binary data directly in JSON.

First, we need to define a Pydantic model for the data you expect to receive. Here’s a basic example:

from pydantic import BaseModel
from typing import Dict, Any
import base64

class ImageData(BaseModel):
    images: Dict[str, Any]

Then, you would use this Pydantic model in your FastAPI endpoint:

from fastapi import FastAPI
from typing import Dict, Any
import base64

app = FastAPI()

@app.post("/parse_json")
async def parse_json(data: ImageData):
    # The received data is already a Python dictionary
    # Images are in base64 encoded format, we need to decode them
    for key, value in data.images.items():
        # Assuming the image is in base64 string format
        if isinstance(value, str):
            data.images[key] = base64.b64decode(value)

    # Returning the modified dictionary
    return data.images

In the above example, we are assuming that the image data is sent as a base64 encoded string. When this data is received, it’s decoded back into binary data using base64.b64decode(). This is necessary because JSON cannot handle binary data directly.

Also, we are using FastAPI’s automatic request body parsing by declaring the data parameter of the parse_json function to be of type ImageData, a Pydantic model. FastAPI will automatically parse the incoming request’s JSON body and validate it according to the Pydantic model, then it will give you the validated data as the parameter to your path operation function.

Please note that this is a basic example and you should modify it according to your requirements, for instance, add error handling, modify the data structure, add more data validation, etc.

Analysis

GPT-4 is far more verbose. As for correctness, GPT-3.5's code works, but it ignores the prompt, which specifies that the request body is in JSON format; instead, it gives a response for multipart file uploads, which is less correct. GPT-4 helpfully notes that binary data can't be sent directly in JSON and must be base64-encoded, and it also defines a helpful Pydantic model for the request format.
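To illustrate GPT-4's point about base64, here is a minimal sketch of the client/server round trip (the field names and image bytes are mine, and the "server side" just mirrors what GPT-4's endpoint does, without actually running FastAPI):

```python
import base64
import json

# Hypothetical raw image bytes (stand-in for a real file's contents)
raw_image = b"\x89PNG\r\n\x1a\nfake image bytes"

# Client side: base64-encode the binary data so it can live in a JSON body
payload = json.dumps(
    {"images": {"photo": base64.b64encode(raw_image).decode("ascii")}}
)

# Server side: decode the string back into the original bytes,
# as GPT-4's endpoint does with base64.b64decode()
decoded = base64.b64decode(json.loads(payload)["images"]["photo"])
assert decoded == raw_image
```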

Personal Experience

In general, I find that GPT-3.5 is much less verbose than GPT-4, but in niche cases it tends to be less correct, which makes GPT-4 worth the subscription cost overall. You also get access to the plugin ecosystem, which includes a code interpreter that can run code for you. In general, it's great for Python development; for more niche cases, it might not be as useful.


Last modified on 2023-07-08