No Language Model

Jul 7, 2023

The latest tech fad appears to be generative AI, and ChatGPT in particular. As with many technologies, this fad has spawned a lot of concern. Some of these concerns are justified, while others are hyperbolic. Unfortunately, most people have a very limited understanding of the technology involved, and that leads to fear, panic, and misuse. The press, which should provide unbiased information, is instead amplifying the panic, making rational discussion of the technology difficult.

Comic with two people, one is on a pile of data, and the other addresses him: 'This is your machine learning system?' The other replies, 'Yup! You pour the data into this big pile of linear algebra, then collect the answers on the other side.' 'What if the answers are wrong?' asks the first person. 'Just stir the pile until they start looking right.' Courtesy xkcd.com

When I first used ChatGPT it reminded me of a much simpler algorithm developed over 100 years ago: Markov chains. A Markov chain is a stochastic way of building a new sequence of items from existing sequences. To choose the next element, we look at the last N members of the sequence generated so far, find every place where that same run of N items occurs in the existing sequences, and pick one of those occurrences at random. The item that follows it in the existing sequence is then appended to the sequence we are generating. To generate text, we can treat the output as a sequence of characters and use a group of source texts as the existing sequences. The surprising thing about this algorithm is that it produces text that looks similar to the input texts: it forms actual words, mimics punctuation, and at first glance looks fairly passable.
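The character-level process described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind any particular site; the function names `build_model` and `generate`, the `order` parameter (the N in the description), and the sample text are all made up for the example.

```python
import random
from collections import defaultdict


def build_model(texts, order=3):
    """Map every run of `order` characters to the characters that followed it."""
    model = defaultdict(list)
    for text in texts:
        for i in range(len(text) - order):
            key = text[i:i + order]
            # Keeping duplicates means common continuations are picked more often.
            model[key].append(text[i + order])
    return model


def generate(model, seed, length=200, rng=None):
    """Extend `seed` one character at a time; len(seed) must equal the model's order."""
    rng = rng or random.Random()
    order = len(seed)
    out = seed
    while len(out) < length:
        followers = model.get(out[-order:])
        if not followers:
            # The last `order` characters never occur mid-text in the sources.
            break
        out += rng.choice(followers)
    return out


if __name__ == "__main__":
    sources = ["the cat sat on the mat and the rat sat on the cat"]
    model = build_model(sources, order=3)
    print(generate(model, seed="the", length=60))
```

A larger `order` makes the output copy longer stretches of the source verbatim, while a smaller one drifts toward gibberish. Either way, the program only looks up which characters followed a given run of characters before.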

ChatGPT is much more advanced, but the fundamentals of the process are somewhat similar. ChatGPT consumes source text and generates new text using a stochastic process based on the existing text. Neither algorithm has any understanding of the text it ingests or produces. Neither algorithm is creative or purposeful in producing the text. Neither algorithm is a true thinking machine, or general AI. The Markov chain obviously produces meaningless babble, but ChatGPT is advanced enough to create grammatically correct sentences and paragraphs that do not wander from topic to topic. This can make it seem like ChatGPT is producing objectively good content, but this is not true. Just like Markov chains, ChatGPT regurgitates pieces of the text it ingested, and even if what it ingests is vetted and true, it can get twisted into nonsense by the algorithm.

To explore this a bit, I decided to create a site that uses Markov chains to produce articles based on articles about ChatGPT and AI. The code is available to demystify the process, and the results, though unreliable, can sometimes be fairly amusing. Generative AI in all its forms is random iteration over human works. There is no soul, no creativity, and no thought involved. It is a remarkable tool, and I am sure it will find uses, but it is not the existential threat some people make it out to be.

Tags: #programming #ai #thoughts