Python Transformers By Huggingface Hands On: 101 practical hands-on implementations of ALBERT, ViT, BigBird, and other recent models with Hugging Face Transformers
Table of Contents

Introduction
Latest Trends in Deep Learning
Cautions
Disclaimer
Trademarks
Feedback
Jupyter Notebook

Chapter 1: pipeline (a minimal sketch follows this table of contents)
1: Set Up Google's Colaboratory Environment
2: Sentiment Analysis
3: Question Answering

Chapter 2: Fine-tuning and Evaluation of DistilBERT Using Real Data
Preparation: GPU Preparation
4: IMDB Dataset
5: Label Encoding
6: Split Training and Validation Data
7: Tokenizing and Encoding
8: Creating Your Own Dataset Class
9: Load the Pre-trained Model (DistilBertForSequenceClassification)
10: Define TrainingArguments
11: Transfer to GPU
12: Fine-tuning with the Trainer Class
13: Fine-tuning with PyTorch

Chapter 3: Model Performance Evaluation
14: Accuracy
15: Recall/Precision/F1-Score
16: Classification Report

Chapter 4: Composition Using the GPT Series
17: Preparing a Writing Environment with GPT-Neo
18: Tokenize with GPT-Neo
19: Composition with GPT-Neo
20: distilgpt2 Environment Setting
21: Composition with distilgpt2
22: DialoGPT Environment Setting
23: Composition with DialoGPT

Chapter 5: MLM (Masked Language Model) (see the fill-mask sketch after this table of contents)
24: MLM pipeline Loading BERT
25: MLM pipeline Loading DistilBERT
26: MLM pipeline Loading ALBERT

Chapter 6: CLIP: Bridging Image Recognition and Natural Language Processing
27: CLIP Module Install
28: Sample Image Dataset
29: Load the CLIP-based Pre-trained Model
30: Check the Network of the CLIP-based Pre-trained Model
31: CLIP Preprocessing
32: Check the Image After Preprocessing
33: Encode and Decode
34: Inference with CLIP
35: Get the Logits of the CLIP Inference
36: Display the CLIP Caption Prediction Result

Chapter 7: Wav2Vec2 Automatic Speech Recognition
37: Wav2Vec2 Module Install
38: Load Pre-trained Wav2Vec2
39: Preparing a Dataset for Automatic Speech Recognition (TIMIT_ASR)
40: Check the Audio Data in Colab
41: Wav2Vec2 Preprocessing
42: ASR with Wav2Vec2

Chapter 8: Multi-class Classification with BERT
43: Load the Pre-trained BERT for Multi-class Classification
44: Prepare Our Own Dataset for Three-class Classification with BERT
45: BERT Classification Before Fine-tuning
46: BERT Fine-tuning for Three-class Classification
47: Visualizing the Learning Process of Fine-tuning BERT for Three-class Classification
48: BERT Classification After Fine-tuning
49: Classification Accuracy

Chapter 9: Automatic Summarization with BART
50: Setting Up the BART Library and Loading the Pre-trained Model
51: Preprocessing Using Regular Expressions
52: Tokenizing with the BART Pre-trained Model
53: Cast the BART Tokenizer Output to a NumPy Array
54: BART Inference
55: Decode the BART Inference Result

Chapter 10: Ensemble Learning with Two BERTs
56: Setting Up the BERT Ensemble Learning Library
57: Preparing Your Own Dataset for the BERT Ensemble
58: Definition of the BERT Ensemble Network
59: Load the Pre-trained BERT for Ensemble Training
60: BERT Ensemble Learning: Data Augmentation
61: BERT Ensemble Learning: Defining a Custom Dataset
62: BERT Ensemble Learning: DataLoader
63: BERT Ensemble Fine-tuning
64: BERT Ensemble Learning: Prediction Using Training Data
65: BERT Ensemble Learning: Prediction Outside the Training Data

Chapter 11: BigBird
66: Setting Up the BigBird Library and Loading the Pre-trained Model
67: Preparation of Data for BigBird Inference
68: BigBird Tokenization and Encoding
69: BigBird Inference

Chapter 12: PEGASUS
70: PEGASUS Library Setup and Pre-trained Model Loading
71: Tokenization and Encoding
72: PEGASUS Automatic Summarization

Chapter 13: M2M100 (see the translation sketch after this table of contents)
73: Install the M2M100 Library and Load the Pre-trained Model
74: Preparation of the M2M100 Translation Source (Chinese Text)
75: M2M100 Tokenize in the Source Language
76: M2M100 Automatic Translation
77: M2M100 Decode the Output of the generate Method
78: M2M100 Specify the Source Language (Japanese) and Create Text
79: M2M100 Japanese Text Tokenization
80: M2M100 Japanese-to-English Translation
81: M2M100 Japanese-to-English Translation Decode

Chapter 14: MobileBERT
82: Install the MobileBERT Library and Load the Pre-trained Model
Code (MobileBERT)
Code (BERT)
83: MobileBERT vs. BERT Tokenizer
84: Last Hidden Layer During MobileBERT Inference
85: MobileBERT Fill-in-the-Blanks Quiz

Chapter 15: GPT, DialoGPT, DistilGPT2
86: Setting Up the DistilGPT2 Library and Loading the Pre-trained Model
87: Visualization with the distilgpt2 Tool
88: distilgpt2 Text Generation
89: Loading DialoGPT (Dialogue Text Pre-trained Model)
90: Text Generation with DialoGPT

Chapter 16: Practical Exercise: Moderna vs. Pfizer (Comparison with BERT and t-SNE)
91: Keyword Search on Wikipedia
92: Retrieved from Wikipedia: "Moderna COVID-19 vaccine" Full Text
93: Retrieved from Wikipedia: "Pfizer–BioNTech COVID-19 vaccine"
94: Installing a Module to Handle Document Vectors in BERT
95: Load the Pre-trained BERT into a pipeline
96: Get Document Vector Representations with BERT
97: Meaning of Vector Dimensionality in BERT
98: Definition of a Function to Get the BERT [CLS] Document Vector Representation, and Simple Preprocessing for BERT
99: Get BERT [CLS] Vectors for the Moderna/Pfizer COVID-19 Vaccines
100: Frequency Aggregation with the Tokenizer
101: Visualization by t-SNE: "Moderna" vs. "Pfizer"

Reference
In Closing
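As referenced at Chapter 1 above, here is a minimal sketch (not taken from the book) of the pipeline workflow that chapter covers: sentiment analysis and question answering through the transformers pipeline API. The example strings and the default checkpoints that pipeline() downloads are assumptions, not the book's own code.

    from transformers import pipeline

    # Sentiment analysis with the library's default English checkpoint.
    classifier = pipeline("sentiment-analysis")
    print(classifier("I really enjoyed this hands-on introduction to Transformers."))

    # Extractive question answering over a short context passage.
    qa = pipeline("question-answering")
    print(qa(
        question="Which library does the book use?",
        context="The book walks through hands-on examples built on the "
                "Hugging Face transformers library.",
    ))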
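For the Chapter 5 MLM items (24-26), the same fill-mask pipeline is loaded with three different checkpoints. The sketch below assumes the standard bert-base-uncased, distilbert-base-uncased, and albert-base-v2 models and an arbitrary prompt; the book's exact checkpoints and examples may differ.

    from transformers import pipeline

    for checkpoint in ["bert-base-uncased", "distilbert-base-uncased", "albert-base-v2"]:
        fill_mask = pipeline("fill-mask", model=checkpoint)
        mask = fill_mask.tokenizer.mask_token        # "[MASK]" for all three models
        best = fill_mask(f"Paris is the {mask} of France.")[0]
        print(checkpoint, best["token_str"], round(best["score"], 3))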
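For the Chapter 13 M2M100 items (75-81), the usual many-to-many translation recipe is to set the source language on the tokenizer and then force the target-language BOS token at generation time. A minimal sketch, assuming the facebook/m2m100_418M checkpoint and an arbitrary Japanese sentence rather than the book's data:

    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
    model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

    tokenizer.src_lang = "ja"                        # Japanese source text
    encoded = tokenizer("今日はいい天気ですね。", return_tensors="pt")
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id("en"),  # translate into English
    )
    print(tokenizer.batch_decode(generated, skip_special_tokens=True))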