Small and Large Language Models: Are SLMs Just Small LLMs?

LLM vs SLM

Large language models increasingly have smaller variants focused on specific datasets. Does that make an SLM merely a slimmed-down LLM, or is there more to it?

While Large Language Models (LLMs) like ChatGPT or Gemini show broad capabilities, attention is simultaneously growing for a smaller and more practical alternative: Small Language Models (SLMs). These are small language models trained on a specific dataset.

Arno van de Velde, Principal Solutions Architect Benelux at Elastic, calls such models “specialist language models”. What else makes these small models different from their larger counterparts? Van de Velde outlines the difference between an LLM and SLM and provides a glimpse of what small language models (still) have in store.

Specialist Language Model

We all broadly know how a Large Language Model (LLM) works, but how does it differ from a Small Language Model (SLM)? Van de Velde has two answers ready for that question. “The short answer is that an SLM is a much smaller form of an LLM,” although he immediately adds that this definition doesn’t do an SLM full justice.

An SLM is a more specific way of dealing with a certain subset of information.

Arno van de Velde, Principal Solutions Architect Benelux, Elastic

A better description of an SLM, according to van de Velde, is “a specific way of dealing with a certain subset of information”. He also calls this a specialist language model. “You train the model within a certain domain on specific information.”

Van de Velde cites the example of lawyers. “They use a very specific language to describe their domain, which creates a different type of language model specifically aimed at professional jargon.”

Fast and Efficient

The advantage of an LLM is that it can perform deeper reasoning and answer complex questions. That naturally requires more computing power, as the model must work through several steps to arrive at an answer.

“When you ask a question to an SLM that fits within the specific domain the model is trained on, you get an answer presented in milliseconds,” explains van de Velde. An SLM is smaller, faster, and lighter, making the model suitable for direct tasks such as providing answers within a specific dataset.

Moreover, SLMs can run locally on a laptop or in environments without a constant internet connection, making them attractive for applications in, for example, defense or industry. This contrasts with a cloud-hosted LLM, which requires a continuous connection.

Small LLMs

LLMs are now found everywhere: in search engines, office software, customer service, and even on mobile devices. Google, for example, developed Gemini Nano, a miniaturized version of its LLM Gemini that runs locally on a smartphone. Does this also fall into the category of an SLM, or is it more a ‘miniaturized version of an LLM’?

Van de Velde finds that this line is becoming increasingly blurred. Nevertheless, he classifies such ‘small LLMs’ as SLMs. On the latest Pixel phones, for example, Gemini Nano powers a generative AI feature that recognizes distant objects in a photo and fills them in. “Such small models could better be described as a specific SLM focused on images,” he states. These SLMs are thus focused on particular subtasks, rather than being ‘small LLMs’.

Do-it-yourself?

How easy is it, then, for a company or developer to create its own SLM? Van de Velde outlines two ways. “You can take an LLM as a starting point and distill information or components from it. That way, you strip out the parts that matter less and arrive at a smaller, specialized version. Another way is to train a model from scratch on a specific domain.”
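To make the first route concrete: distillation trains a small “student” model to mimic a large “teacher” by matching the teacher’s softened output probabilities. The toy sketch below is purely illustrative (it is not van de Velde’s or Elastic’s method, and the logits are made up); it shows the core loss that real distillation pipelines minimize, the KL divergence between teacher and student distributions.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution, softened by a temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's softened outputs and the student's.

    During distillation the student's weights are updated to minimize this,
    so it reproduces the teacher's behavior with far fewer parameters.
    """
    p = softmax(teacher_logits, temperature)  # teacher's "soft labels"
    q = softmax(student_logits, temperature)  # student's prediction
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student that matches the teacher exactly incurs zero loss;
# a student that disagrees incurs a positive loss the optimizer pushes down.
teacher = [3.0, 1.0, 0.2]
print(distillation_loss(teacher, [3.0, 1.0, 0.2]))  # ~0.0
print(distillation_loss(teacher, [0.2, 1.0, 3.0]))  # positive
```

In a real setup the loss would be computed over a whole domain-specific dataset and combined with a standard training objective, but the mimicry principle is the same.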

For many organizations, the added value lies not in fully training an SLM themselves, but in smartly deploying and adapting already available models.

Arno van de Velde, Principal Solutions Architect Benelux, Elastic

In both cases, however, there remains a great need for extensive datasets and substantial investments, something most companies don’t have on hand. “Organizations therefore more often choose to refine existing compact models, Llama 3 being a well-known example, and combine them with additional functions or run multiple small models side by side. This makes it possible to build targeted solutions, such as faster access to internal information, automatic summaries, or support with customer files.”

“For many organizations, the added value therefore lies not in fully training an SLM themselves, but in smartly deploying and adapting already available models. Developing completely new models remains primarily the domain of large players with considerable research budgets,” states van de Velde.

Orchestrator

Van de Velde expects small language models to play a big role, especially in combination with the rise of AI agents. “Instead of one model that does everything, an LLM becomes more of an orchestrator that distributes various tasks: calling a piece of code, performing a hybrid search, using an SLM for a specific task, or engaging a dedicated tool.”
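The orchestrator pattern van de Velde describes can be sketched in a few lines. The handler names below are hypothetical stand-ins for real components (a code runner, a hybrid search backend, a domain-specific SLM); the point is only the dispatch structure, where one coordinating layer routes each subtask to a specialist instead of sending everything to a single large model.

```python
# Hypothetical stand-in handlers; in practice each would call a real
# tool, search backend, or specialist model.
def run_code(task):
    return f"executed: {task}"

def hybrid_search(task):
    return f"search results for: {task}"

def legal_slm(task):
    return f"legal answer to: {task}"

# The orchestrator's routing table: task type -> specialist.
ROUTES = {
    "code": run_code,
    "search": hybrid_search,
    "legal": legal_slm,
}

def orchestrate(task_type, payload):
    """Dispatch a subtask to the right specialist instead of one big model."""
    handler = ROUTES.get(task_type)
    if handler is None:
        raise ValueError(f"no specialist registered for {task_type!r}")
    return handler(payload)

print(orchestrate("legal", "review this contract clause"))
```

In an agent setup, the LLM itself would decide which route to take for each step; the table-based dispatch here is just the simplest way to show the division of labor.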

According to him, the biggest gain in the short term lies in small-scale, practical applications where small models are smartly combined to create direct added value.