9. HuggingFace 🤗 NLP Notes 8: fine-tuning a model with PyTorch (end of the beginner tutorial series ❀❀)
10. simpleAi index
11. HuggingFace Transformers tutorial
12. GPT-2 usage guide: from fine-tuning to deployment
13. GPT2-Pytorch with Text-Generator
14. Deploying to Flask
15. Fine-tuning GPT-2 with PyTorch
16. Translation of 15 (useful for reproducing the code)
17. Pretrained model series: GPT-2 model code study notes (PyTorch version)
Fine-tuning

finetune-gpt2xl

Training data = the original token sequence; labels = the same sequence shifted by one token, so that each position predicts the next token. Training is done with the Trainer API (a minimal sketch follows below).

The training workflow differs between PyTorch, TensorFlow, and Keras; the differences are mainly in which model class you choose and in how the text data is vectorized (tokenized).
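A minimal sketch of this setup with the Trainer API, assuming a plain-text corpus; the file name train.txt, the output directory, and the hyperparameters are illustrative placeholders, not values from the original notes:

```python
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# one training example per line of the text file (placeholder path)
dataset = load_dataset("text", data_files={"train": "train.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# mlm=False selects causal LM: the collator copies input_ids into labels,
# and the model shifts them internally so each position predicts the next token
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-finetuned",
    num_train_epochs=1,
    per_device_train_batch_size=2,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```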
The GPT2LMHeadModel class vs. the GPT2Model class: GPT2Model is the bare transformer and returns only hidden states, while GPT2LMHeadModel adds a language-modeling head (tied to the input embeddings) that maps hidden states to vocabulary logits, as sketched below.
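A quick sketch of that difference; the shapes in the comments assume the standard gpt2 checkpoint (hidden size 768, vocabulary size 50257):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
ids = torch.tensor([tokenizer.encode("Hello world")])

base = GPT2Model.from_pretrained("gpt2")      # bare transformer
lm = GPT2LMHeadModel.from_pretrained("gpt2")  # transformer + LM head

print(base(ids).last_hidden_state.shape)  # (1, seq_len, 768) hidden states
print(lm(ids).logits.shape)               # (1, seq_len, 50257) vocab logits
```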
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

generated = tokenizer.encode("The Manhattan bridge")
context = torch.tensor([generated])
past_key_values = None

for i in range(30):
    output = model(context, past_key_values=past_key_values)
    # reuse the cached key/value states so each step only has to
    # process the single newly generated token
    past_key_values = output.past_key_values
    # greedy decoding: pick the most likely next token
    token = torch.argmax(output.logits[..., -1, :])
    context = token.unsqueeze(0)
    generated += [token.tolist()]

sequence = tokenizer.decode(generated)
sequence = sequence.split(".")[:-1]  # drop the trailing incomplete sentence
print(sequence)
```

Here the output returned by the model is a CausalLMOutputWithPastAndCrossAttentions object; the logits and the past_key_values cache returned by the model are attributes of it:

```python
CausalLMOutputWithPastAndCrossAttentions(
    loss=loss,
    logits=lm_logits,
    past_key_values=transformer_outputs.past_key_values,
    hidden_states=transformer_outputs.hidden_states,
    attentions=transformer_outputs.attentions,
    cross_attentions=transformer_outputs.cross_attentions,
)
```
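For comparison, a sketch of the same greedy decoding with the built-in generate() helper; note that max_length counts the prompt tokens plus the newly generated ones:

```python
input_ids = torch.tensor([tokenizer.encode("The Manhattan bridge")])
# do_sample=False gives greedy decoding, matching the manual loop above
output_ids = model.generate(input_ids, max_length=33, do_sample=False)
print(tokenizer.decode(output_ids[0]))
```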
```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import logging

logging.basicConfig(level=logging.INFO)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# encode the prompt into token ids and add a batch dimension
text = "Who was Jim Henson ? Jim Henson was a"
indexed_tokens = tokenizer.encode(text)
tokens_tensor = torch.tensor([indexed_tokens])

model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()  # disable dropout for deterministic inference

# fall back to CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
tokens_tensor = tokens_tensor.to(device)
model.to(device)

with torch.no_grad():
    outputs = model(tokens_tensor)
    predictions = outputs[0]  # logits, shape (batch, seq_len, vocab_size)

# greedily pick the most likely next token and append it to the prompt
predicted_index = torch.argmax(predictions[0, -1, :]).item()
predicted_text = tokenizer.decode(indexed_tokens + [predicted_index])
print(predicted_text)  # expect something like "... Jim Henson was a man"
```
The same generation loop, originally written against the older transformers API where the model returned a plain (logits, past) tuple and the cache keyword was named past; updated here to the current ModelOutput interface:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

generated = tokenizer.encode("The Manhattan bridge")
context = torch.tensor([generated])
past = None

for i in range(100):
    # older versions: output, past = model(context, past=past)
    output = model(context, past_key_values=past)
    past = output.past_key_values
    token = torch.argmax(output.logits[..., -1, :])
    generated += [token.tolist()]
    context = token.unsqueeze(0)

sequence = tokenizer.decode(generated)
print(sequence)
```