
Transfer Learning with Transformer Neural Networks

Transformer neural networks have revolutionized the field of natural language processing (NLP).

 

Instead of training large models from scratch, developers can now leverage transfer learning with pretrained transformer models to perform complex NLP tasks quickly and efficiently. In this post, we'll explore how Hugging Face's Transformers library makes it easy to implement text classification, sentiment analysis, summarization, and more, all with just a few lines of code.

 

Key NLP Tasks Solved by Transformers

Some typical tasks in the field of natural language processing (NLP) are described in the following list:

  • Text classification: Assignment of a text to one or more predefined categories, such as spam detection or sentiment analysis.
  • Machine translation: Automatic translation of texts between different languages.
  • Text summarization: Automatic creation of a shorter version of a text that contains the essential content.
  • Question answering: Answering questions based on a given context or on knowledge acquired during training.
  • Text generation: Automatic creation of new texts based on prompts.

Using the Transformers library from Hugging Face, we'll now demonstrate how to use pretrained transformer networks for different tasks.

 

Hugging Face: The Hub for Pretrained Models

Hugging Face is a company and an open source platform that originally specialized in NLP technologies. Today, Hugging Face is best known for its extensive library of pretrained models for all kinds of AI tasks, as well as tools for developing and implementing AI solutions. The main goal is to make the use of modern machine learning models more accessible to researchers and developers.

 

On this platform, you can find the following (and much more):

  • A collection of datasets for training neural networks
  • A central repository with thousands of pretrained models for various tasks
  • A platform for trying out applications (referred to on the platform as Spaces) with the help of user-friendly interfaces

Getting Started with the Transformer Library

Before you start, you must install the necessary libraries. You should already be familiar with installing packages in an Anaconda environment or in Google Colab. The following listing shows the installation command for a Jupyter Notebook or Google Colab cell.

 

!pip install transformers datasets evaluate accelerate
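
To confirm that the installation worked, you can import the library and print its version number; a quick sanity check:

import transformers

# Prints the installed version of the Transformers library
print(transformers.__version__)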

 

Example: Sentiment Analysis in Action

To access one of the models for a specific NLP task, we create a pipeline() instance and define the type of task. In the listing below, we show an example that performs a text classification—more precisely, a sentiment analysis. The aim is to classify a text in terms of its positivity or negativity.

 

from transformers import pipeline

sentclassifier = pipeline("text-classification")

result = sentclassifier("I am very happy with this book on Neural Nets!")

print(result)

# Output:

[{'label': 'POSITIVE', 'score': 0.9998326301574707}]

 

We’ve only specified the task here; the pipeline() function then falls back to a default transformer network, which in this example is a distilled BERT model: distilbert/distilbert-base-uncased-finetuned-sst-2-english. However, this model only works for the English language.
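
If you want your results to stay stable even if the library changes its default model in a future release, you can pin this model explicitly. Here's a minimal sketch:

from transformers import pipeline

# Pin the English sentiment model explicitly instead of relying on the default
sentclassifier = pipeline("text-classification", model="distilbert/distilbert-base-uncased-finetuned-sst-2-english")
print(sentclassifier("I am very happy with this book on Neural Nets!"))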

 

You’ll also find models for other languages on Hugging Face, which you’ll then have to specify in more detail, as shown in this listing.

 

from transformers import pipeline

sentclassifier = pipeline("text-classification", model=" nlptown/bert-basemultilingual-

uncased-sentiment",device="cuda")

result= sentclassifier("I am very happy with this book on neural networks!")

print(result)

# Output:

[{'label': '5 stars', 'score': 0.8118821382522583}]

 

You can see that the output is slightly different, but the text is still rated positively. In addition, we’ve specified the device="cuda" parameter in the definition of sentclassifier so that the function also uses the GPU for sentiment analysis.
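
If you're not sure whether a GPU is available on your machine, you can select the device at runtime instead of hard-coding it. A minimal sketch using PyTorch's availability check:

import torch
from transformers import pipeline

# Use the GPU if CUDA is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
sentclassifier = pipeline("text-classification", model="nlptown/bert-base-multilingual-uncased-sentiment", device=device)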

 

Of course, it’s also possible to transfer multiple sentences at the same time, as shown here.

 

from transformers import pipeline

sentclassifier = pipeline("text-classification", model="nlptown/bert-basemultilingual-

uncased-sentiment",device="cuda")

results = sentclassifier(["I am very happy with this book on Neural Nets!", "I

don't like the topic at all!"])

print(results)

# Output:

[{'label': '5 stars', 'score': 0.7789900898933411}, {'label': '1 star', 'score': 0.673673152923584}]
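
The same pipeline() pattern covers the other tasks from the list at the beginning of this post. The following sketch relies on the library's default model for each task, so the exact models (and their outputs) may differ depending on your transformers version:

from transformers import pipeline

# Question answering: extract an answer span from a given context
qa = pipeline("question-answering")
print(qa(question="What does the pipeline function do?", context="The pipeline function loads a pretrained model for a given task."))

# Text generation: continue a prompt with newly generated text
generator = pipeline("text-generation")
print(generator("Neural networks are", max_new_tokens=20))

# Translation from English to German
translator = pipeline("translation_en_to_de")
print(translator("I like this book on neural networks."))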

 

Tokenization and Model Internals

Let’s now take a closer look at the internal processes. The first step is to transform text into tokens or token IDs. In the previous examples, this happened automatically behind the scenes; the following listing shows exactly what goes on.

 

from transformers import pipeline

from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

model = AutoModelForSequenceClassification.from_pretrained(model_name)

tokenizer = AutoTokenizer.from_pretrained(model_name)

sentclassifier = pipeline("sentiment-analysis", model=model, tokenizer=

tokenizer)
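
This manually assembled pipeline behaves just like the one created by task name in the earlier listings; a quick check with an example sentence of your own confirms it:

print(sentclassifier("This chapter explains tokenization really well!"))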

 

We use the same model as shown earlier and take a look at what a token list looks like. The tokenizer loaded via AutoTokenizer.from_pretrained() converts the text into tokens before it is passed to the model. It's important to note that every model comes with its own tokenizer and that these usually differ from model to model, as shown below.

 

sequence = "I still like this book about neural networks!"

# Transformation to tokens

tokens = tokenizer.tokenize(sequence)

print("Tokens: ", tokens)

# Transformation into IDs

ids = tokenizer.convert_tokens_to_ids(tokens)

print("IDs: ", ids)

# Back-transformation of IDs into a sentence

decode_ids = tokenizer.decode(ids)

print("IDs for sentence: ", decode_ids)

# Output:

Tokens: ['i', 'still', 'like', 'this', 'book', 'about', 'neural', 'networks', '!']

IDs: [151, 12440, 11531, 10372, 11768, 10935, 86165, 28310, 106]

IDs for sentence: i still like this book about neural networks!

 

Here, you can see that not every word necessarily corresponds to a single token: words that aren't in the tokenizer's vocabulary are subdivided into subword tokens (in this short sentence, each word happens to map to exactly one token). Each token has its own ID, and these IDs are what the model actually processes. It's also interesting that both the tokens and the back-transformed sentence are in lowercase. Why is that? Well, if we look at the name of the model used, bert-base-multilingual-uncased-sentiment, we read uncased, which means case-insensitive.
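
To see this subdivision in action, you can tokenize a longer or rarer word. Which pieces you get depends entirely on the model's vocabulary, so the split mentioned in the comment below is only an illustrative assumption; BERT-style tokenizers mark word continuations with a leading ##:

# Rare or compound words are typically split into several subword tokens
print(tokenizer.tokenize("backpropagation"))
# A plausible result looks something like ['back', '##pro', '##pag', '##ation'];
# the exact split depends on the tokenizer's vocabulary.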

 

Exploring the Hugging Face Model Hub

So how do we find the right model for our NLP tasks? To get started, we want to find a model that summarizes a text, so go to https://huggingface.co/models to get an overview of all the models Hugging Face makes available to us.

 

[Figure: Model Selection on Hugging Face]

 

Then, we select the Summarization task under Tasks (shown in the bottom left of the above figure) and the English language under Languages. Let’s now summarize some text from the book Programming Neural Networks with Python.
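
If you prefer searching from code instead of clicking through the web interface, the huggingface_hub client can apply the same filters programmatically. A minimal sketch, assuming the list_models() function in your installed huggingface_hub version supports filtering by task and language:

from huggingface_hub import list_models

# List popular English summarization models, sorted by download count
for model in list_models(task="summarization", language="en", sort="downloads", limit=5):
    print(model.id)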

 

We can select any model from the model selection. Models that have already been downloaded many times are a good choice, since a high download count often indicates acceptable quality. We therefore select the facebook/bart-large-cnn model, which is based on the facebook/bart-large model and has been fine-tuned on a news summarization dataset (shown below).

 

from transformers import pipeline

summarizer = pipeline("summarization",model="facebook/bart-large-cnn", device=

"cuda")

text = "Having acquired the theoretical knowledge of Convolutional Neural Networks and Transformer Neural Networks in the previous chapter, we will now implement the tools using TensorFlow and the integrated Keras library. We first created our own network model to generate a classifier for the MNIST dataset. We then used (very complex and powerful) pre-trained Deep Neural Nets to apply them to tasks. On the one hand, we used a Convolutional Neural Net called Inception-v3, which is already very good at extracting essential features from an image. Secondly, we used the Hugging Face library to solve typical NLP tasks. This approach, also known as transfer learning, saves us the timeconsuming training of our own complex network."

 

summary = summarizer(text, min_length=10, max_length=100)

print(summary)

# Output:

Device set to use cuda

[{'summary_text': 'We first created our own network model to generate a classifier for the MNIST dataset. We then used (very complex and powerful) pre-trained Deep Neural Nets to apply them to tasks. This approach, also known as transfer learning, saves us the time-consuming training of our own complex network.'}]

 

Remember that the output of transformer neural networks doesn’t always have to be the same. So, you should expect to receive a different summary of the text.
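
If you want more control over the summary, the generation keyword arguments are passed through to the underlying model. A minimal sketch in which the parameter values are purely illustrative:

from transformers import pipeline, set_seed

# Fixing the random seed makes sampling-based generation repeatable
set_seed(42)

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
text = "Transfer learning with pretrained transformer models lets developers solve NLP tasks such as classification, summarization, and translation without training large networks from scratch."

# Beam search with do_sample=False gives deterministic output for a fixed model version
summary = summarizer(text, min_length=10, max_length=40, do_sample=False, num_beams=4)
print(summary[0]["summary_text"])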

 

Conclusion

Transfer learning with Hugging Face Transformers allows developers to solve NLP tasks with minimal effort while still achieving state-of-the-art results. By using pretrained models, you save time, resources, and computing power while unlocking the ability to perform sentiment analysis, translation, summarization, and more. Whether you’re just starting out or refining advanced applications, Hugging Face’s ecosystem makes cutting-edge NLP accessible to everyone.

 

Editor’s note: This post has been adapted from a section of the book Programming Neural Networks with Python by Joachim Steinwendner and Roland Schwaiger. Dr. Steinwendner is a scientific project leader specializing in data science, machine learning, recommendation systems, and deep learning. Dr. Schwaiger is a software developer, freelance trainer, and consultant. He has a PhD in mathematics and he has spent many years working as a researcher in the development of artificial neural networks, applying them in the field of image recognition.

 

This post was originally published 9/2025.

Recommendation

Programming Neural Networks with Python

Neural networks are at the heart of AI—so ensure you’re on the cutting edge with this guide! For true beginners, get a crash course in Python and the mathematical concepts you’ll need to understand and create neural networks. Or jump right into programming your first neural network, from implementing the scikit-learn library to using the perceptron learning algorithm. Learn how to train your neural network, measure errors, make use of transfer learning, implement the CRISP-DM model, and more. Whether you’re interested in machine learning, gen AI, LLMs, deep learning, or all of the above, this is the AI book you need!

by Rheinwerk Computing

Rheinwerk Computing is an imprint of Rheinwerk Publishing and publishes books by leading experts in the fields of programming, administration, security, analytics, and more.
