The `source_dir` and `output_path` attributes of the Hugging Face Estimator define the paths to the source directory containing the training script (as a tar.gz file) and to the model outputs, respectively.
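A configuration sketch of such an estimator is below. The bucket names, IAM role, instance type, container versions and hyperparameters are all illustrative placeholders, not values from the original post:

```python
from sagemaker.huggingface import HuggingFace

# Hypothetical S3 locations -- substitute your own bucket and prefixes.
source_dir = 's3://my-bucket/training/sourcedir.tar.gz'  # tar.gz containing the training script
output_path = 's3://my-bucket/models/'                   # where model artefacts are written

huggingface_estimator = HuggingFace(
    entry_point='train.py',          # script inside the tar.gz archive
    source_dir=source_dir,
    output_path=output_path,
    instance_type='ml.p3.2xlarge',
    instance_count=1,
    role='arn:aws:iam::123456789012:role/SageMakerRole',  # placeholder role ARN
    transformers_version='4.6.1',    # an example supported container combination
    pytorch_version='1.7.1',
    py_version='py36',
    hyperparameters={'epochs': 3, 'train_batch_size': 32},
)
# huggingface_estimator.fit({'train': 's3://my-bucket/data/'}) would launch the job.
```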
We use SageMaker Studio with a Python 3 (PyTorch 1.6 Python 3.6 CPU Optimized) kernel to run our code, and install the required packages in the first cell as follows:

```
!pip install 'sagemaker>=2.48.0' 'transformers==4.9.2' 'datasets==1.11.0' --upgrade
```

## Configure estimator source and output
This post describes how to modify the above example to train multiclass categorisation models in SageMaker using CSV data stored in S3.
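As a minimal sketch of what such CSV training data might look like before it is uploaded to S3 (the `text`/`label` column names and the category labels are assumptions for illustration, not taken from the post):

```python
import csv
import os
import tempfile

# Toy rows: a transaction description and its accounting category.
# 'Costa Coffee Edinburgh' appears in the post; the labels are invented.
rows = [
    ('Costa Coffee Edinburgh', 'Travel and subsistence'),
    ('LNER East Coast Trains', 'Travel and subsistence'),
    ('AWS EMEA SARL', 'Computer software'),
]

out_dir = tempfile.mkdtemp()
train_csv = os.path.join(out_dir, 'train.csv')
with open(train_csv, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['text', 'label'])   # header row expected by the training script
    writer.writerows(rows)

# Uploading could then be done with, for example:
#   aws s3 cp train.csv s3://my-bucket/data/train.csv
```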
What we struggled to find was an example showing how to fine-tune for multiclass categorisation with a custom dataset.
## Fine-Tuning BERT for multiclass problems

BERT is an approach for constructing vector representations of input natural language data, based on the transformer architecture [6]. The family of transformer-based models [8] achieves the current state-of-the-art performance [9] in tasks such as machine translation, named-entity recognition, question answering and sentiment analysis. BERT has been instrumental in the adoption of transfer learning for natural language processing, in the same way as ImageNet [7] was for computer vision. The representations are learned by pre-training on very large text corpora and can be used as inputs when learning to perform various downstream tasks – a process referred to as fine-tuning.
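To make the fine-tuning step concrete, the sketch below shows (in plain Python, with random stand-in numbers rather than a real BERT model) what the added classification head does: it maps the pooled sentence representation to one logit per target class, and a softmax turns those logits into class probabilities. The sizes are illustrative assumptions:

```python
import math
import random

random.seed(0)
hidden_size, num_labels = 768, 5   # 768 matches BERT-base; num_labels is arbitrary

# Stand-in for the pooled BERT output for one transaction description.
pooled = [random.gauss(0, 1) for _ in range(hidden_size)]

# The fine-tuned head: a weight matrix (num_labels x hidden_size) plus biases.
W = [[random.gauss(0, 0.02) for _ in range(hidden_size)] for _ in range(num_labels)]
b = [0.0] * num_labels

# One logit per target class.
logits = [sum(w_i * x_i for w_i, x_i in zip(row, pooled)) + b_k
          for row, b_k in zip(W, b)]

# Softmax converts logits into a probability per accounting category.
exps = [math.exp(l - max(logits)) for l in logits]
probs = [e / sum(exps) for e in exps]
```

During fine-tuning, both this head and the pre-trained BERT weights beneath it are updated on the labelled data.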
This issue is that the vocabulary learned when vectorizing the training-data transactions, which considers both unigrams and bigrams, can reach the tens of millions. The number of parameters we have to store is roughly the vocabulary size multiplied by the number of target classes when we train the model in a one-vs-rest fashion. Adding more target classes to the model therefore increases the number of large matrix multiplications required to serve the predictions, affecting model latency, and also the space required to hold the model object in memory.

For this reason, as well as curiosity as to whether vector representations based on an attention mechanism [3] would perform better than bag-of-words, we were keen to explore using a BERT [4] model for transaction categorisation. In this post we discuss how we made use of the Hugging Face transformers library [5] to fine-tune a BERT model to categorise our bank transactions. The work described was carried out together with our summer intern, Harry Tullett.
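A back-of-the-envelope version of that scaling argument (the numbers are illustrative, not FreeAgent's actual figures):

```python
# One-vs-rest linear model: one weight per vocabulary term, per class.
vocab_size = 10_000_000   # unigram + bigram vocabulary can reach the tens of millions
num_classes = 100         # one weight vector per target class

parameters = vocab_size * num_classes

# Stored densely as 64-bit floats, that is already a very large model object.
bytes_per_param = 8
gigabytes = parameters * bytes_per_param / 1e9
```

Doubling the number of classes doubles both the matrix-multiplication work at prediction time and the memory footprint, which is the scalability pressure described above.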
When FreeAgent customers import their bank transactions, we predict which accounting categories the transactions belong to by making requests to a machine learning model managed with Amazon SageMaker. The model inputs are bank transaction descriptions (e.g. ‘Costa Coffee Edinburgh’) vectorized using a tf-idf bag-of-words approach [1], and the model itself is a linear support vector machine (see e.g. chapter 12 of The Elements of Statistical Learning [2]). This approach is simple in terms of both the preprocessing and the modelling, and has been running on the smallest real-time inference instance SageMaker has to offer, so costs have been low! However, we discovered a scalability issue that has been hampering us for some time.
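The vocabulary explosion comes from counting bigrams as well as unigrams. A toy sketch of that extraction step (the real pipeline uses a tf-idf vectorizer; this just shows how fast the term set grows):

```python
def ngrams(description, n):
    """Return the contiguous n-grams of a whitespace-tokenized description."""
    tokens = description.lower().split()
    return [' '.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Toy transaction descriptions (only the first appears in the post).
descriptions = [
    'Costa Coffee Edinburgh',
    'Costa Coffee Glasgow',
    'Edinburgh Trams',
]

vocabulary = set()
for d in descriptions:
    vocabulary.update(ngrams(d, 1))  # unigrams
    vocabulary.update(ngrams(d, 2))  # bigrams
```

Three short descriptions already yield nine distinct terms; over millions of real transactions, with every merchant and location combination producing new bigrams, the vocabulary reaches the tens of millions mentioned above.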