AI Demystified – Understanding Generative AI in Simple Terms

Published on 10 May 2024 at 08:11

Generative AI, or genAI, has become a buzzword that’s hard to miss. It’s being discussed not just among tech enthusiasts but in everyday conversations, family discussions, the news and even in the halls of Congress. But what exactly is generative AI? Let’s break it down into simpler terms for a better grasp without delving into the complexities of a PhD thesis.

Generative AI is all about prediction.  Given some text, predict what comes next.  If you haven’t experienced or tried it yet, check out chat.bing.com; it’s become one of my favorites, though certainly not the only one out there.  My first experience with genAI was at Adobe Firefly.  As you can see, you can ask genAI to create an image, build a lesson plan, help with vacation planning, answer questions, and a whole lot more.
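
Here’s a minimal sketch of that “predict what comes next” idea in code, using the open-source Hugging Face transformers library and the small GPT-2 model. GPT-2 is my stand-in for illustration only; the chatbots mentioned above run far larger models behind the scenes.

```python
# A minimal sketch of next-word prediction (assumes the `transformers`
# package is installed; GPT-2 is a small, freely downloadable model).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("The best thing about summer is", max_new_tokens=10)
print(result[0]["generated_text"])  # the prompt plus predicted words (output varies)
```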

Let’s take a little peek behind the scenes and see how all this works.

The first part is the collection of information, and a lot of it.  Think: encyclopedias, public domain books, magazine articles, web archives (all the stuff on the web), anything and everything that the model builders can get.  You may hear the term corpus associated with the data that is used to build the model; the word refers to a complete set of information on a topic.
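
As a toy illustration of gathering text, here is a sketch that reads plain-text files from a folder into a corpus. The folder name my_text_collection is made up for the example; real model builders scrape and clean data at a vastly larger scale.

```python
# A toy sketch of assembling a training corpus from plain-text files.
# The directory name is hypothetical, purely for illustration.
from pathlib import Path

corpus = []
for path in Path("my_text_collection").glob("*.txt"):  # hypothetical folder
    corpus.append(path.read_text(encoding="utf-8"))

print(f"Collected {len(corpus)} documents, "
      f"{sum(len(doc) for doc in corpus):,} characters in total")
```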

All this information is processed by a computer program: a rather complex one, and one that isn’t entirely understood.

The processing of all this information is called training, and it relies on techniques with names you may have heard: machine learning, neural networks, deep learning, and natural language processing.

 

The training of all this information results in something called a large language model (LLM).  An LLM is a set of data representing patterns, grammar, semantics, and knowledge, captured in billions of parameters.  Alongside that data there is some software, again a neural-network type of program, that processes your question and gives you an answer.  Creating an LLM is no simple task; it takes a lot of computing power, which translates into a lot of money.  Estimates of $50 million to $100 million to train some of the large models have been mentioned in various talks.  However, the big tech companies don’t divulge the details or the exact figures.
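
You can actually peek at the parameters of a small open model. Here is a sketch that counts them using the transformers library; GPT-2 is used because it’s small and free to download, while the big commercial models are vastly larger.

```python
# A sketch that counts a model's parameters, the "data" mentioned above.
# Assumes `transformers` (and its torch backend) is installed.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
num_params = sum(p.numel() for p in model.parameters())
print(f"GPT-2 has {num_params:,} parameters")  # roughly 124 million
```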

Training: how does that work?

    • All that data, the corpus, for the model is converted (encoded) into numbers.  The number for each word (or word piece) is called a token, and a series of tokens is called a vector.  Computers can work with numbers and vectors a whole lot faster than they can with text (see the encoding sketch after this list).
    • Those vectors are fed into a computer program, a neural network, where things like patterns and semantics are determined. 
    • The result of all that information processing is a pre-trained LLM.
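
Here is what the encoding step can look like in practice, using GPT-2’s tokenizer from the transformers library as one real example; every model family has its own encoder.

```python
# A sketch of encoding text into tokens and decoding back again.
# Assumes the `transformers` package is installed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokens = tokenizer.encode("Generative AI is all about prediction.")
print(tokens)                    # a list of numbers (token ids)
print(tokenizer.decode(tokens))  # decoding reverses it back to the text
```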


Using the Model:

Using a web-based program, desktop app, mobile app, or API (where programmers write code to use the model):

  1. You input a prompt, and that prompt is translated into tokens (remember: vectors, i.e., numbers) because that’s easier for the computer.
  2. The neural network program processes those vectors and computes probabilities using statistical patterns learned during training to predict the next word.
  3. The output is generated and decoded into the sequence of words that the model predicts, or other objects like images, sound, or video.  Decoding is the reverse of encoding: the program takes the numbers and converts them back to text or other formats for consumption by the user (text, images, sound).  A sketch of these three steps follows below.
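
Here is a minimal sketch of those three steps end to end, again using the small open GPT-2 model via the transformers and torch libraries as a stand-in for the much larger models behind commercial chatbots.

```python
# A sketch of steps 1-3: prompt in, tokens, probabilities, next word out.
# Assumes `transformers` and `torch` are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# 1. Encode the prompt into token ids (the "vectors").
inputs = tokenizer("The capital of France is", return_tensors="pt")

# 2. The neural network computes a score for every possible next token.
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits[0, -1], dim=-1)  # turn scores into probabilities

# 3. Pick the most likely token and decode it back into text.
next_id = int(torch.argmax(probs))
print(tokenizer.decode([next_id]))  # likely " Paris"
```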

 

A bit on neural networks: 

The use of these terms comes from our basic understanding of how the brain works.  One of the objectives of AI is to think the way the brain thinks.  We have learned a lot about the brain and how it works: we talk about neurons firing and communicating with each other, regions of the brain, and so on.  But we don’t fully understand how the brain works.

Neural networks are often represented by diagrams of nodes: input nodes, hidden nodes, output nodes, and layers upon layers.  Parameters, weights, and other things on the nodes can be tweaked by the model builders.  Just like the brain, we don’t fully understand how these neural networks work.  Building a model seems as much art and craft as science, as the model builders tweak and tune.
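
To make the diagram talk concrete, here is a toy “layer” of a neural network: inputs flow through weighted connections (the parameters the builders tweak) to hidden nodes. The sizes and numbers are made up purely for illustration.

```python
# A toy neural-network layer: 3 input nodes feeding 4 hidden nodes.
# Assumes only the `numpy` package; values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
inputs = np.array([0.5, -1.2, 3.0])   # 3 input nodes
weights = rng.normal(size=(3, 4))     # connections to 4 hidden nodes
biases = np.zeros(4)

hidden = np.maximum(0, inputs @ weights + biases)  # ReLU activation
print(hidden)  # activations of the hidden layer; real models stack
               # many such layers with billions of weights
```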

 

How to use generative AI

A few of the ways to use generative AI are through some of the big tech websites: Gemini (google.com), Microsoft Designer, Microsoft Copilot in Bing, Phind, Adobe Firefly, Meta AI, GitHub Copilot (your AI pair programmer), and of course ChatGPT.
 

Conclusion

While there are loads of posts, stories, and conspiracy theories floating around, AI isn’t at the point of taking over the world.  The output from these programs can seem very human, assertive, and confident, but remember that the output is all a prediction based on the input and the various knobs and switches the modelers have selected.  There are real risks, such as inaccurate output that a person relies on as truth, and built-in bias inherited from the data used to train the model.  GenAI can be very powerful, bringing creativity to the masses, improving our productivity, and serving as a great reference for information.  We all have to use our good brains, put a filter on what comes out of these models, and use the tools and information responsibly. 

Note: images, some text, and research for this post were created from generative AI.

 

Terms – a few terms related to Gen AI that you may find useful.

  1. Hallucinations: In the context of AI, hallucinations refer to unexpected or erroneous outputs generated by neural networks or other AI models during tasks like image generation or text completion. These outputs may not align with the intended patterns or data and can be considered “hallucinatory.”

  2. LLM (Large Language Model): LLMs are powerful natural language processing models that use deep learning techniques. They can generate human-like text, perform language translation, answer questions, and more. Examples include GPT-3 and BERT.

  3. Encoding: Encoding involves converting data (such as text, images, or audio) into a format suitable for processing by an AI model. It represents information in a structured way, making it easier for the model to learn patterns and make predictions.

  4. Decoding: Decoding is the reverse process of encoding. It involves converting model-generated representations (such as hidden states or embeddings) back into meaningful output, such as translated sentences or reconstructed images.

  5. Neural Network: A neural network is a computational model inspired by the human brain’s interconnected neurons. It consists of layers of interconnected nodes (neurons) that process input data and learn to perform tasks like classification, regression, or pattern recognition.

  6. Fine-Tuning: Fine-tuning involves adjusting a pre-trained neural network on a specific task or dataset. Instead of training from scratch, fine-tuning starts with existing weights and adapts them to the new task, improving performance. (A minimal fine-tuning sketch appears after this list.)

  7. Prompt Engineering: Prompt engineering refers to designing effective input prompts or instructions for AI models. Well-crafted prompts can guide the model’s behavior and improve its output quality in tasks like language generation or question answering.

  8. Self-Training: most of the actual training of an LLM is done via self-training. For example, the program will take a sentence, chop off the last word, feed the shortened sentence into the model, and predict what that last word should be, scoring itself along the way.  Then it will take the last two words off and continue the process of chopping off words. (A small self-training sketch appears after this list.)

  9. Supervised Learning: a method of training an LLM by exposing the model to labeled data (input-output pairs) that are known to be true.  Humans often create these labeled pairs, often with a specific use case in mind. During training, the model adjusts its internal parameters (weights) to minimize the difference between its predictions and the ground-truth labels.

  10. Corpus: “all the writings or works of a particular kind or on a particular subject; especially: the complete works of an author.” (Merriam-Webster)
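
To make term 6 a bit more concrete, here is a minimal fine-tuning sketch using the transformers and torch libraries. The model name (gpt2), the one-sentence “dataset,” and the learning rate are all illustrative assumptions, not a recipe from any particular vendor.

```python
# A minimal fine-tuning sketch: start from pre-trained weights and nudge
# them toward new data. Assumes `transformers` and `torch` are installed;
# the toy dataset and learning rate are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")  # existing weights

texts = ["Question: What is a corpus? Answer: a body of text."]  # toy data
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for text in texts:
    batch = tokenizer(text, return_tensors="pt")
    # Using the input ids as labels: the model learns to predict each next token.
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()   # adjust weights to reduce prediction error
    optimizer.step()
    optimizer.zero_grad()
print(f"loss after one step: {outputs.loss.item():.3f}")
```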
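
And here is a tiny, library-free sketch of the chop-a-word-off idea from term 8. The sentence is made up; real self-training does this across the entire corpus at enormous scale.

```python
# A sketch of self-training data creation: chop off the last word, then
# the last two, and so on, producing (context, target) prediction pairs.
def make_training_pairs(sentence):
    words = sentence.split()
    pairs = []
    for i in range(len(words) - 1, 0, -1):  # chop off 1 word, then 2, ...
        context = " ".join(words[:i])
        target = words[i]
        pairs.append((context, target))
    return pairs

for context, target in make_training_pairs("the cat sat on the mat"):
    print(f"given {context!r} predict {target!r}")
```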

 
