Artificial Intelligence and large language models

There will be no GPT-5 — Here is why!

A speculative look behind the metrics of GPT models.

Moritz Kross
3 min read · Aug 28, 2023


Photo by Mojahid Mottakin on Unsplash

Let’s be honest… After last year was the year of text-to-image generation with Dall-E, Midjourney, and Stable Diffusion, 2023 is definitely the year of large language models for generating text.

OpenAI took the market by storm with GPT-3.5 and its successor GPT-4. All the big tech companies are trying to grab a piece of the cake and are stumbling over the problems of coming second in the race.

Let's do some stats…

GPT-3 has proven to be a game changer with its 175 billion parameters, 45 TB of training data, 700 GB of RAM for full-precision training, and a cost of “only” 4.6 million US$. GPT-4 was announced earlier this year, and according to Sam Altman, the CEO of OpenAI, it cost around 100 million US$. According to some sources, it has roughly 1.76 trillion parameters.
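As a quick sanity check, the 700 GB figure matches what you get from storing 175 billion parameters as 32-bit floats. The sketch below only counts the weights themselves. The 4 bytes per parameter and the helper name `weight_memory_gb` are my assumptions for illustration, and real training needs additional memory for gradients, optimizer state, and activations.

```python
# Assumption: "full precision" means 32-bit floats, i.e. 4 bytes per parameter.
BYTES_PER_FP32 = 4


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_FP32) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as quoted in the article (the GPT-4 number is unconfirmed).
gpt3_weights_gb = weight_memory_gb(175e9)
gpt4_weights_gb = weight_memory_gb(1.76e12)

print(f"GPT-3 weights: {gpt3_weights_gb:,.0f} GB")
print(f"GPT-4 weights (if 1.76 trillion parameters): {gpt4_weights_gb:,.0f} GB")
```

Under the same assumption, a model with 1.76 trillion parameters would need about 7 TB for its weights alone.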

Let's add some math...

So, if we assume GPT-5 follows the growth of the previous models, it will have more than 17 trillion parameters, and the costs will increase by a factor of about 20, to roughly 2 billion US$. This is a huge amount of capital to invest in a business case whose success you cannot reliably predict.
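The arithmetic behind these numbers can be sketched in a few lines. This is a naive extrapolation, not a forecast: it assumes the growth factor from GPT-3 to GPT-4 simply repeats for the next generation, and it uses the figures quoted above, of which the GPT-4 ones are unconfirmed. The function name `extrapolate` is made up for this example.

```python
# Figures as quoted in the article (the GPT-4 values are unconfirmed estimates).
GPT3_PARAMS = 175e9
GPT4_PARAMS = 1.76e12
GPT3_COST_USD = 4.6e6
GPT4_COST_USD = 100e6


def extrapolate(previous: float, current: float) -> tuple:
    """Return (growth factor, next value), assuming the last growth factor repeats."""
    factor = current / previous
    return factor, current * factor


param_factor, gpt5_params = extrapolate(GPT3_PARAMS, GPT4_PARAMS)
cost_factor, gpt5_cost = extrapolate(GPT3_COST_USD, GPT4_COST_USD)

print(f"Parameters: x{param_factor:.1f} -> {gpt5_params / 1e12:.1f} trillion")
print(f"Cost: x{cost_factor:.1f} -> {gpt5_cost / 1e9:.2f} billion US$")
```

With these inputs, the parameter count grows by a factor of about 10 and the cost by a factor of a bit more than 20, which is where the "more than 17 trillion" and "2 billion US$" come from.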

Let's think about some inherent problems...

Larger models result in the following:

  • Much more hardware is needed
  • Energy consumption will increase heavily
  • Training time will increase dramatically

And now think about the use cases that will arise. Do you think all that effort will result in totally new features that we cannot imagine right now, or that aren’t already solved by specialized models today?

OpenAI is valued at 29 billion US$, according to various sources. So they might have the capital to continue developing even larger models. Still, it is questionable whether they will continue down this path. Nevertheless, all the big tech companies are trying to achieve the unachievable: real artificial general intelligence.

Let's make an assumption


Moritz Kross

Senior Fullstack Software Engineer (.Net, Java, Python) and DataScientist