The Future of Artificial Intelligence
Scooped by Juliette Decugis

Mamba (Transformer Alternative): The Future of LLMs and ChatGPT?

The article discusses the emergence of a non-attention architecture for language modeling, in particular Mamba, which has shown promising results in experimental tests. Mamba is an example of a state-space model (SSM). But what is a state-space model? State-space models (SSMs) are a class of mathematical models used to describe the evolution of […]
Juliette Decugis's insight:

Following the successes of state-space models such as S4, Albert Gu and Tri Dao introduce Mamba (article), a selective SSM that scales linearly with sequence length. It achieves state-of-the-art results in language, audio, and genomics compared with same-size and even larger Transformers. Rather than using attention mechanisms for context selection, Mamba relies on state spaces. Unlike previous SSMs, its selection mechanism and hardware-aware GPU implementation promise speed-ups in large-scale settings.
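To see why an SSM scales linearly with sequence length, here is a minimal sketch of the discrete SSM recurrence that underlies models like Mamba: a hidden state is updated once per token, so processing a sequence is a single O(L) scan rather than the O(L²) pairwise comparisons of attention. The scalar parameters A, B, C below are hypothetical toy values for illustration, not taken from the Mamba paper (where they are learned matrices, and in Mamba's selective variant are functions of the input):

```python
def ssm_scan(xs, A=0.9, B=1.0, C=0.5):
    """Run the discrete SSM recurrence over a sequence xs:
        h_t = A * h_{t-1} + B * x_t   (state update)
        y_t = C * h_t                 (readout)
    One pass over the sequence -> O(L) time, O(1) state memory.
    """
    h = 0.0
    ys = []
    for x in xs:          # each token touches the state exactly once
        h = A * h + B * x
        ys.append(C * h)
    return ys

# Impulse response: a single 1.0 followed by zeros decays
# geometrically with factor A, showing how past context
# is carried forward through the state.
outputs = ssm_scan([1.0, 0.0, 0.0, 0.0])
# outputs == [0.5, 0.45, 0.405, 0.3645]
```

Mamba's contribution is to make these parameters input-dependent ("selective") while keeping the scan efficient on GPUs via a hardware-aware implementation.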


What’s new with GPT-4 — from processing pictures to acing tests

OpenAI’s latest AI language model has officially been announced: GPT-4. Here’s a rundown of some of the system’s new capabilities and functions, from image processing to acing tests.
Juliette Decugis's insight:

Since the rise of ChatGPT, the world has been waiting for GPT-4, released only two weeks ago!


This article highlights the model's new abilities: processing images as well as text, better language understanding, and enhanced error correction.


These improvements do come at the cost of a bigger model: OpenAI has not disclosed GPT-4's parameter count, but it is widely believed to be substantially larger than its predecessor GPT-3 (175 billion parameters). The development of ever-larger language models has raised concerns about the future and ethics of AI: is it safe to deploy models we don't fully understand?
