Humans are very good at transferring knowledge gained in one task to another. Whenever we encounter a new problem, we recognise it and apply relevant knowledge from our previous learning experience to shortcut a solution.
For instance, if you know how to surf and try snowboarding for the first time, you will draw on your experience of keeping your balance and turning with your body, and you’ll also be quicker to spot good riding conditions (wave type vs. snow type). This makes things easier for you than for someone who is completely green in the field. Such learnings are very useful in real life: they make us more agile and let us gain more experience. This ability to operate on a conceptual level makes our work easier and faster to finish. That’s something machines have always lagged behind in… until now.
Following this line of thinking, Machine Learning scientists introduced Transfer Learning. It takes knowledge learned in one task and applies it to a problem in a related target task. While most machine learning and AI systems are designed to address a single task (a narrow type of AI), the development of algorithms that facilitate transfer learning is a topic of ongoing interest in both academic and business communities.
How conventional machine learning algorithms work
For those who are just starting their journey with AI and Machine Learning: machine learning is the general term for computers learning from data without being explicitly programmed. Instead, they learn by being shown examples, just like children do. Machine learning algorithms recognise patterns in the data and make predictions once new data arrives.
Now, transfer learning is a technique that enables algorithms to learn a new task by using pre-trained models. Here is a simplified comparison of conventional machine learning and transfer learning:
[Figure: conventional machine learning vs. transfer learning. Credit: untrite.com]
In traditional learning, a machine learning algorithm works in isolation. Given a large enough dataset, it learns how to perform a specific task. However, when asked to solve a new problem, it cannot draw on any previously gained knowledge. Instead, a conventional algorithm needs a second dataset to begin a new learning process.
In transfer learning, learning a new task draws on insights from previously learned tasks. The algorithm is constructed in such a way that it can store and access knowledge, so that knowledge is “recycled”. The model is general rather than task-specific. Training new machine learning models can be resource-intensive, so transfer learning saves both resources and time.
How does transfer learning work?
In supervised machine learning, models are trained on labelled data to complete specific tasks. Inputs and desired outputs are clearly mapped and fed to the algorithm. The model can then apply the patterns it has learned to new data. Models developed in this way are highly accurate when solving tasks in the same environment as their training data, but they become much less accurate if real-world conditions shift beyond what the training data covers. A new model based on new training data may then be required, even if the tasks are similar.
The transfer learning approach takes parts of a pre-trained machine learning model and applies them to a new but similar problem. The reused parts are usually the core of what the model needs to function, with new components added to solve the specific task. Programmers need to identify which areas of the model are relevant to the new task and which parts need to be retrained. For example, a new model may keep the processes that allow the machine to identify objects or data, but be retrained to identify a different specific object.
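To make the "keep some parts, retrain others" idea concrete, here is a minimal, dependency-free Python sketch. It is an illustration under stated assumptions rather than a production recipe: the names base_weights, extract_features and train_head are made up for this example, and the random base weights merely stand in for a model that would really have been pre-trained on a source task. The base stays frozen while only a small new "head" is trained on the target task.

```python
import math
import random

random.seed(0)

# Hypothetical "pre-trained" base: in a real project these weights would be
# loaded from a model trained on the source task. Random values stand in here.
N_INPUTS, N_FEATURES = 4, 6
base_weights = [[random.uniform(-1, 1) for _ in range(N_INPUTS)]
                for _ in range(N_FEATURES)]


def extract_features(x):
    """Frozen base: maps a raw input to features. Never updated below."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in base_weights]


def predict(head, bias, feats):
    """New task-specific head: logistic regression on the frozen features."""
    z = bias + sum(h * f for h, f in zip(head, feats))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(head, bias, data):
    """Average cross-entropy of the head over (features, label) pairs."""
    total = 0.0
    for feats, y in data:
        p = predict(head, bias, feats)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def train_head(data, epochs=200, lr=0.1):
    """Full-batch gradient descent that touches only the head parameters."""
    head, bias = [0.0] * N_FEATURES, 0.0
    for _ in range(epochs):
        grad_h, grad_b = [0.0] * N_FEATURES, 0.0
        for feats, y in data:
            err = predict(head, bias, feats) - y
            grad_b += err
            for j, f in enumerate(feats):
                grad_h[j] += err * f
        bias -= lr * grad_b / len(data)
        head = [h - lr * g / len(data) for h, g in zip(head, grad_h)]
    return head, bias


# Small toy dataset for the new (target) task: label depends on the first input.
inputs = [[random.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(40)]
target_data = [(extract_features(x), 1 if x[0] > 0 else 0) for x in inputs]

base_before = [row[:] for row in base_weights]
loss_before = log_loss([0.0] * N_FEATURES, 0.0, target_data)
head, bias = train_head(target_data)
loss_after = log_loss(head, bias, target_data)

print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
print("base unchanged:", base_weights == base_before)
```

After running it, base_weights is untouched and only head and bias have been learned for the new task. In practice you would reach for a framework such as PyTorch or Keras, but the division of labour is the same.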
Benefits of transfer learning for machine learning
Transfer learning brings a range of benefits to the development of machine learning models. The main ones are resource savings and improved efficiency when training new models.
Accurately labelling large datasets takes a huge amount of time. Much of the data organisations encounter is unlabelled, especially at the scale required to train a machine learning algorithm. With transfer learning, a model can be trained on an available labelled dataset and then applied to a similar task that may involve unlabelled data.
Transfer learning also increases learning speed. With fewer new things to learn, the algorithm produces high-quality output sooner. To use an analogy, a snowboarder is likely to learn to balance on the waves while surfing more quickly than an average person, because certain concepts apply to both disciplines.
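The speed-up can be shown with a deliberately tiny toy example. The sketch below is an assumption-laden illustration, not a benchmark: it fits a one-parameter model y = w * x with gradient descent, first on a made-up source task (slope 2.0) and then on a related target task (slope 2.2). The function names fit and mse, the data and the learning rate are all chosen for this example. Starting from the source-task weight needs fewer update steps than starting from zero, because the model has less left to learn.

```python
def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def fit(data, w_init, lr=0.05, tol=1e-6, max_steps=1000):
    """Gradient descent on y = w * x; returns (w, number of update steps)."""
    w = w_init
    for step in range(max_steps):
        if mse(w, data) < tol:
            return w, step
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w, max_steps


xs = [1.0, 2.0, 3.0]
source_task = [(x, 2.0 * x) for x in xs]   # source task: slope 2.0
target_task = [(x, 2.2 * x) for x in xs]   # related target task: slope 2.2

w_source, _ = fit(source_task, w_init=0.0)
_, cold_steps = fit(target_task, w_init=0.0)        # learn from scratch
_, warm_steps = fit(target_task, w_init=w_source)   # reuse source knowledge

print(f"from scratch: {cold_steps} steps, transferred: {warm_steps} steps")
```

The gap grows with the size of the model and the similarity of the tasks. If the source and target tasks were unrelated, the transferred starting point would offer little or no advantage.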
Transfer learning in natural language processing
Natural language processing allows machines to “understand” and analyse human language and its context, whether from audio (via speech-to-text) or from text files. It opens up many ways to increase productivity and wellbeing at work by improving how humans and computers interact. Natural language processing is intrinsic to everyday services like voice assistants, speech recognition software, automated captions, translation, document processing, and intelligent search and document contextualisation tools like Untrite AI.
Transfer learning is used in a range of ways to strengthen machine learning models that deal with natural language processing. Examples include simultaneously training a model to detect different elements of language or taxonomy, and embedding pre-trained layers that understand specific dialects or vocabulary. For example, a model can take company-internal knowledge acquired in one area, say R&D projects, and apply it to a new project in another, say sales. Transfer learning can also be used to adapt models across languages: aspects of models trained and refined on English can be adapted for similar languages or tasks.
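Reusing pre-trained vocabulary knowledge can be sketched in a few lines. Everything in the example below is an assumption made for illustration: the EMBEDDINGS table holds made-up two-dimensional word vectors standing in for embeddings that would normally come from a model trained on a large text corpus, and the tiny sentiment task is invented. The word vectors stay frozen, and the only thing learned for the new task is one average vector (centroid) per label.

```python
# Made-up 2-D "pre-trained" word vectors. Real ones would be loaded from a
# model trained on a large text corpus; these values are illustrative only.
EMBEDDINGS = {
    "great": (0.9, 0.1), "excellent": (0.8, 0.2), "good": (0.7, 0.1),
    "bad": (-0.8, 0.1), "terrible": (-0.9, 0.2), "awful": (-0.7, 0.0),
    "service": (0.0, 0.9), "product": (0.0, 0.8),
}


def mean_vector(vectors):
    """Component-wise average of a non-empty list of equal-length vectors."""
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def embed(sentence):
    """Average the frozen word vectors; unknown words are skipped."""
    vectors = [EMBEDDINGS[w] for w in sentence.lower().split() if w in EMBEDDINGS]
    if not vectors:
        raise ValueError(f"no known words in: {sentence!r}")
    return mean_vector(vectors)


# Tiny labelled dataset for the new task (note: no "excellent", no "awful").
labelled = [
    ("great service", "positive"), ("good product", "positive"),
    ("bad service", "negative"), ("terrible product", "negative"),
]

# The only thing learned for the new task: one centroid per label.
centroids = {}
for label in {lab for _, lab in labelled}:
    centroids[label] = mean_vector([embed(s) for s, lab in labelled if lab == label])


def classify(sentence):
    """Return the label whose centroid is closest to the sentence vector."""
    vec = embed(sentence)
    return min(centroids, key=lambda lab: sum((a - b) ** 2
                                              for a, b in zip(vec, centroids[lab])))


print(classify("excellent product"))  # positive
print(classify("awful service"))      # negative
```

Neither "excellent" nor "awful" appears in the labelled examples, yet both sentences are classified correctly, because the transferred word vectors already place those words close to "great" and "bad".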