
@[toc]

In the previous post, 【35】Sequence序列网络介绍与使用, I showed how to build a sequence model with the RNN and LSTM interfaces that PyTorch provides. In this post I use those networks for some time-series prediction tasks.

Two examples are provided here:

  • 1) Use an RNN to predict a sine wave
  • 2) Use an LSTM to predict periodic airline passenger data

    The first example comes from 龙曲良's course; the second is adapted from a blog post (see the references at the end).

    In the next post, I will also cover a few other LSTM applications that I find quite fun:

  • 3) Use an LSTM for text classification
  • 4) Use an LSTM for image classification
  • 5) Use an LSTM to generate handwritten digit images

    1. Time-series prediction: using an RNN to predict a sine wave

    Here the task is simple: use an RNN to fit and predict a sine curve.

    1.1 Import packages

    import numpy as np
    import torch
    import torch.nn as nn
    import torch.optim as optim
    from matplotlib import pyplot as plt
    
    # disable scientific-notation output
    np.set_printoptions(suppress=True)
    

    1.2 Hyperparameter settings

    num_time_steps = 50
    input_size = 1
    hidden_size = 16
    output_size = 1
    lr = 0.01
    

    1.3 Model definition

    Here I define a single-layer RNN that predicts the waveform value at the next time step; a fully connected layer maps the hidden state to the final prediction.

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.rnn = nn.RNN(
                input_size=input_size,
                hidden_size=hidden_size,
                num_layers=1,
                batch_first=True,
            )
            # initialize the RNN weights with small random values
            for p in self.rnn.parameters():
                nn.init.normal_(p, mean=0.0, std=0.001)
            self.linear = nn.Linear(hidden_size, output_size)

        def forward(self, x, hidden_prev):
            out, hidden_prev = self.rnn(x, hidden_prev)
            # out: [b, seq, h] -> flatten to [b*seq, h] for the linear layer
            out = out.view(-1, hidden_size)
            out = self.linear(out)
            out = out.unsqueeze(dim=0)  # back to [1, seq, output_size]
            return out, hidden_prev
    
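    As a quick sanity check (my own addition, not part of the course code), you can push a dummy batch through the network and confirm the shapes:

    net = Net()
    h0 = torch.zeros(1, 1, hidden_size)                      # [num_layers, b, hidden_size]
    dummy = torch.randn(1, num_time_steps - 1, input_size)   # one batch of 49 steps
    out, h = net(dummy, h0)
    print(out.shape, h.shape)  # torch.Size([1, 49, 1]) torch.Size([1, 1, 16])
    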

    1.4 Model training

    model = Net()
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr)
    hidden_prev = torch.zeros(1, 1, hidden_size)
    for iter in range(6000):
        # randomly pick an integer start point from [0, 3)
        start = np.random.randint(3, size=1)[0]
        # sample num_time_steps (50) evenly spaced points over [start, start + 10]
        time_steps = np.linspace(start, start + 10, num_time_steps)
        data = np.sin(time_steps)
        data = data.reshape(num_time_steps, 1)
        x = torch.tensor(data[:-1]).float().view(1, num_time_steps - 1, 1)
        y = torch.tensor(data[1:]).float().view(1, num_time_steps - 1, 1)
        output, hidden_prev = model(x, hidden_prev)
        hidden_prev = hidden_prev.detach()
        loss = criterion(output, y)
        model.zero_grad()
        loss.backward()
        # for p in model.parameters():
        #     print(p.grad.norm())
        # torch.nn.utils.clip_grad_norm_(p, 10)   # gradient clipping
        optimizer.step()
        if iter % 100 == 0:
            print("Iteration: {} loss {}".format(iter, loss.item()))
    
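    One detail worth flagging: hidden_prev is detached after every iteration, which cuts the autograd graph between iterations (a simple form of truncated backpropagation through time). Without the detach, the second loss.backward() would raise a RuntimeError, since it would try to backpropagate through a part of the graph that the first backward pass already freed. A minimal sketch of the failure mode (my own illustration):

    h = torch.zeros(1, 1, hidden_size)
    for step in range(2):
        out, h = model(torch.randn(1, 4, 1), h)
        out.sum().backward()   # errors on the 2nd pass if the detach below is removed
        h = h.detach()         # cutting the graph here keeps each step independent
    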

    Training output:

    Iteration: 0 loss 0.6180879473686218
    Iteration: 100 loss 0.002067530993372202
    Iteration: 200 loss 0.0033898595720529556
    Iteration: 300 loss 0.0028499257750809193
    Iteration: 400 loss 0.0016402869950979948
    Iteration: 500 loss 0.0015365825966000557
    Iteration: 600 loss 0.0007872747373767197
    Iteration: 700 loss 0.0013388539664447308
    Iteration: 800 loss 0.0009592065471224487
    Iteration: 900 loss 0.00023990016779862344
    Iteration: 1000 loss 0.0014855732442811131
    Iteration: 1100 loss 0.0005326495738700032
    Iteration: 1200 loss 0.0006256766500882804
    Iteration: 1300 loss 0.0013620775425806642
    Iteration: 1400 loss 0.0010062839137390256
    Iteration: 1500 loss 0.0011015885975211859
    Iteration: 1600 loss 2.126746221620124e-05
    Iteration: 1700 loss 0.000390108791179955
    Iteration: 1800 loss 0.0016134703764691949
    Iteration: 1900 loss 0.001025702222250402
    Iteration: 2000 loss 0.0015523084439337254
    Iteration: 2100 loss 0.0003646172699518502
    Iteration: 2200 loss 0.000642314029391855
    Iteration: 2300 loss 0.0004722262092400342
    Iteration: 2400 loss 5.732586942031048e-05
    Iteration: 2500 loss 0.0013018838362768292
    Iteration: 2600 loss 0.0008787440019659698
    Iteration: 2700 loss 0.0009332495392300189
    Iteration: 2800 loss 0.0007348429062403738
    Iteration: 2900 loss 0.0004210669139865786
    Iteration: 3000 loss 0.0004240694979671389
    Iteration: 3100 loss 0.0005803901003673673
    Iteration: 3200 loss 0.00195941049605608
    Iteration: 3300 loss 8.851500751916319e-05
    Iteration: 3400 loss 0.00030293306917883456
    Iteration: 3500 loss 0.00031553610460832715
    Iteration: 3600 loss 0.00031228255829773843
    Iteration: 3700 loss 0.00039033975917845964
    Iteration: 3800 loss 0.00020553779904730618
    Iteration: 3900 loss 0.0011390189174562693
    Iteration: 4000 loss 0.00015774763596709818
    Iteration: 4100 loss 7.591523899463937e-05
    Iteration: 4200 loss 8.699461614014581e-05
    Iteration: 4300 loss 0.0004444770747795701
    Iteration: 4400 loss 0.0004794316191691905
    Iteration: 4500 loss 0.00047654256923124194
    Iteration: 4600 loss 0.0002331874711671844
    Iteration: 4700 loss 0.0003834210510831326
    Iteration: 4800 loss 0.0005701710470020771
    Iteration: 4900 loss 0.00010284259042236954
    Iteration: 5000 loss 5.453555058920756e-05
    Iteration: 5100 loss 0.00043016314157284796
    Iteration: 5200 loss 0.00013311025395523757
    Iteration: 5300 loss 0.0003936859138775617
    Iteration: 5400 loss 0.0002417201321804896
    Iteration: 5500 loss 0.00017509324243292212
    Iteration: 5600 loss 0.00017528091848362237
    Iteration: 5700 loss 0.00040555483428761363
    Iteration: 5800 loss 0.000313715310767293
    Iteration: 5900 loss 9.533046249998733e-05
    

    1.5 Model testing

    start = np.random.randint(3, size=1)[0]
    time_steps = np.linspace(start, start + 10, num_time_steps)
    data = np.sin(time_steps)
    data = data.reshape(num_time_steps, 1)
    x = torch.tensor(data[:-1]).float().view(1, num_time_steps - 1, 1)
    y = torch.tensor(data[1:]).float().view(1, num_time_steps - 1, 1)
    predictions = []
    input = x[:, 0, :]
    for _ in range(x.shape[1]):
      input = input.view(1, 1, 1)
      (pred, hidden_prev) = model(input, hidden_prev)
      input = pred
      predictions.append(pred.detach().numpy().ravel()[0])
    x = x.data.numpy().ravel()
    y = y.data.numpy()
    plt.scatter(time_steps[:-1], x.ravel(), s=90)
    plt.plot(time_steps[:-1], x.ravel())
    plt.scatter(time_steps[1:], predictions)
    plt.show()
    

    Note that this is a closed-loop rollout: only the first ground-truth point is fed in, and each subsequent input is the model's own previous prediction, with the hidden state carried across steps. The visualized result (figure: ground-truth sine points and curve, plus the rolled-out predictions):

    1.6 Full code

    import numpy as np
    import torch
    import torch.nn as nn
    import torch.optim as optim
    from matplotlib import pyplot as plt
    num_time_steps = 50
    input_size = 1
    hidden_size = 16
    output_size = 1
    lr = 0.01
    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.rnn = nn.RNN(
                input_size=input_size,
                hidden_size=hidden_size,
                num_layers=1,
                batch_first=True,
            )
            # initialize the RNN weights with small random values
            for p in self.rnn.parameters():
                nn.init.normal_(p, mean=0.0, std=0.001)
            self.linear = nn.Linear(hidden_size, output_size)

        def forward(self, x, hidden_prev):
            out, hidden_prev = self.rnn(x, hidden_prev)
            # out: [b, seq, h] -> flatten to [b*seq, h] for the linear layer
            out = out.view(-1, hidden_size)
            out = self.linear(out)
            out = out.unsqueeze(dim=0)  # back to [1, seq, output_size]
            return out, hidden_prev
    model = Net()
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr)
    hidden_prev = torch.zeros(1, 1, hidden_size)
    # Why this works:
    # the model emits one output for every input time step, so the output has the same length as the input;
    # each output is compared with the true value at the next time step, and that comparison defines the training loss
    for iter in range(6000):
        start = np.random.randint(3, size=1)[0]
        time_steps = np.linspace(start, start + 10, num_time_steps)
        data = np.sin(time_steps)
        data = data.reshape(num_time_steps, 1)
        x = torch.tensor(data[:-1]).float().view(1, num_time_steps - 1, 1)
        y = torch.tensor(data[1:]).float().view(1, num_time_steps - 1, 1)
        # output holds a prediction for every time step; one could train on just the final hidden
        # state hidden_prev, or on all the per-step outputs; here the whole output sequence is used
        output, hidden_prev = model(x, hidden_prev)
        hidden_prev = hidden_prev.detach()
        # build the loss
        loss = criterion(output, y)
        model.zero_grad()
        loss.backward()
        # for p in model.parameters():
        #     print(p.grad.norm())
        # torch.nn.utils.clip_grad_norm_(p, 10)
        optimizer.step()
        if iter % 100 == 0:
            print("Iteration: {} loss {}".format(iter, loss.item()))
    start = np.random.randint(3, size=1)[0]
    time_steps = np.linspace(start, start + 10, num_time_steps)
    data = np.sin(time_steps)
    data = data.reshape(num_time_steps, 1)
    x = torch.tensor(data[:-1]).float().view(1, num_time_steps - 1, 1)
    y = torch.tensor(data[1:]).float().view(1, num_time_steps - 1, 1)
    predictions = []
    input = x[:, 0, :]
    for _ in range(x.shape[1]):
      input = input.view(1, 1, 1)
      (pred, hidden_prev) = model(input, hidden_prev)  # hidden_prev starts from the state left over by training
      input = pred      # feed the current prediction back in as the next input
      predictions.append(pred.detach().numpy().ravel()[0])  # from a single seed input, the model keeps predicting step by step
    # visualization; y is not actually used here
    x = x.data.numpy().ravel()
    y = y.data.numpy()
    plt.scatter(time_steps[:-1], x.ravel(), s=90)
    plt.plot(time_steps[:-1], x.ravel())
    plt.scatter(time_steps[1:], predictions)
    plt.show()
    

    2. Time-series prediction: using an LSTM to predict periodic airline passenger data

    2.1 Import packages

    from sklearn.preprocessing import MinMaxScaler
    import torch
    import torch.nn as nn
    import seaborn as sns
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    %matplotlib inline
    

    2.2 Data visualization

  • Inspect the dataset as a table

    flight_data = sns.load_dataset("flights")
    flight_data.head()
    
    (the first five rows of the flights dataframe are displayed as a table)

  • Plot the monthly passenger totals

    fig_size = plt.rcParams["figure.figsize"]
    fig_size[0] = 15   # figure width/height per the referenced tutorial
    fig_size[1] = 5
    plt.rcParams["figure.figsize"] = fig_size
    plt.title('Month vs Passenger')
    plt.ylabel('Total Passengers')
    plt.xlabel('Months')
    plt.grid(True)
    plt.autoscale(axis='x',tight=True)
    plt.plot(flight_data['passengers'])
    plt.show()
    

    2.3 Data preprocessing

  • Build the training and test sets

    all_data = flight_data['passengers'].values.astype(float)
    test_data_size = 12
    train_data = all_data[:-test_data_size]
    test_data = all_data[-test_data_size:]
    print(len(train_data))
    print(len(test_data))
    
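    The flights dataset has 144 monthly records (1949 through 1960), so the two prints should show:

    132
    12
    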
  • Normalization

    # normalize the data with min/max scaling into the range (-1, 1)
    scaler = MinMaxScaler(feature_range=(-1, 1))
    # the scaler expects a 2-D column, hence the reshape
    train_data_normalized = scaler.fit_transform(train_data.reshape(-1, 1))
    # convert to a tensor and flatten back to 1-D
    train_data_normalized = torch.FloatTensor(train_data_normalized).view(-1)
    
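    If the min/max scaling step is unfamiliar, here is a tiny standalone illustration of what the scaler does (my own sketch, not part of the original code):

    demo = np.array([[100.0], [200.0], [300.0]])
    s = MinMaxScaler(feature_range=(-1, 1))
    print(s.fit_transform(demo).ravel())   # [-1.  0.  1.]
    print(s.inverse_transform([[0.5]]))    # [[250.]]
    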

    2.4 Build the dataset

    # create input windows and their next-value labels for training
    train_window = 12
    def create_inout_sequences(input_data, tw):
        inout_seq = []
        L = len(input_data)
        for i in range(L-tw):
            train_seq = input_data[i:i+tw]
            train_label = input_data[i+tw:i+tw+1]
            inout_seq.append((train_seq ,train_label))
        return inout_seq
    train_inout_seq = create_inout_sequences(train_data_normalized, train_window)
    
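    To make the windowing concrete, here is what create_inout_sequences produces on a toy tensor (my own example):

    toy = torch.arange(5.0)                  # tensor([0., 1., 2., 3., 4.])
    print(create_inout_sequences(toy, 2))
    # [(tensor([0., 1.]), tensor([2.])),
    #  (tensor([1., 2.]), tensor([3.])),
    #  (tensor([2., 3.]), tensor([4.]))]
    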

    2.5 Model creation

    Here I build a plain single-layer LSTM network to predict the time series.

    class LSTM(nn.Module):
        def __init__(self, input_size=1, hidden_layer_size=200, output_size=1):
            super().__init__()
            self.hidden_layer_size = hidden_layer_size
            self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_layer_size, num_layers=1)
            self.linear = nn.Linear(hidden_layer_size, output_size)
            self.hidden_cell = (torch.zeros(1, 1, self.hidden_layer_size),
                                torch.zeros(1, 1, self.hidden_layer_size))
        def forward(self, input_seq):
            lstm_out, self.hidden_cell = self.lstm(input_seq, self.hidden_cell)
            predictions = self.linear(lstm_out.view(lstm_out.size(0), -1))
            return predictions[-1]
    model = LSTM()
    print(model)
    seq, labels = train_inout_seq[0]
    print("seq.shape:{}, labels.shape:{}".format(seq.shape, labels.shape))
    model(seq[:,None,None]).shape
    
    LSTM(
      (lstm): LSTM(1, 200)
      (linear): Linear(in_features=200, out_features=1, bias=True)
    )
    seq.shape:torch.Size([12]), labels.shape:torch.Size([1])
    torch.Size([1])
    
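    To see where those shapes come from, here is the forward pass unrolled step by step for one training pair (a sketch reusing the model and data defined above):

    seq, labels = train_inout_seq[0]
    x = seq[:, None, None]               # [12, 1, 1]: (seq_len, batch, input_size)
    lstm_out, _ = model.lstm(x)          # [12, 1, 200]
    flat = lstm_out.view(x.size(0), -1)  # [12, 200]
    preds = model.linear(flat)           # [12, 1]: one prediction per time step
    print(preds[-1].shape)               # torch.Size([1]): only the last step is returned
    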

    2.6 Model training

    model = LSTM()
    loss_function = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    epochs = 200
    for i in range(epochs):
        for seq, labels in train_inout_seq:
            optimizer.zero_grad()
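            # re-initialize the hidden and cell states so each training sequence starts fresh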
            model.hidden_cell = (torch.zeros(1, 1, model.hidden_layer_size),
                                 torch.zeros(1, 1, model.hidden_layer_size))
            y_pred = model(seq[:,None,None])
            single_loss = loss_function(y_pred, labels)
            single_loss.backward()
            optimizer.step()
        if i % 25 == 24 or i == 0:
            print(f'epoch: {i:3} loss: {single_loss.item():10.8f}')
    print("-" * 80)
    print("train success, loss:{}".format(single_loss.item()))
    
    epoch:   0 loss: 0.08318637
    epoch:  24 loss: 0.00497702
    epoch:  49 loss: 0.00723564
    epoch:  74 loss: 0.00158882
    epoch:  99 loss: 0.00002601
    epoch: 124 loss: 0.00142831
    epoch: 149 loss: 0.00005675
    epoch: 174 loss: 0.00011440
    epoch: 199 loss: 0.00020664
    --------------------------------------------------------------------------------
    train success, loss:0.0002066389424726367
    

    2.7 Model testing

    fut_pred = 12
    test_inputs = train_data_normalized[-train_window:].tolist()
    len(test_inputs), train_data_normalized.shape
    # output:  (12, torch.Size([132]))
    model.eval()
    # predict the next twelve values
    for i in range(fut_pred):
        # repeatedly feed the most recent 12 values back into the model to get the next prediction
        seq = torch.FloatTensor(test_inputs[-train_window:])
        with torch.no_grad():
            model.hidden_cell = (torch.zeros(1, 1, model.hidden_layer_size),
                                 torch.zeros(1, 1, model.hidden_layer_size))
            pred = model(seq[:,None,None]).item()
            test_inputs.append(pred)
            print('i:{}, pred:{}'.format(i, pred))
    
    i:0, pred:0.454874724149704
    i:1, pred:0.5675606727600098
    i:2, pred:0.8822157979011536
    i:3, pred:0.781266987323761
    i:4, pred:0.5928680896759033
    i:5, pred:0.5053937435150146
    i:6, pred:0.7401428818702698
    i:7, pred:1.0358431339263916
    i:8, pred:1.2730270624160767
    i:9, pred:1.2577072381973267
    i:10, pred:1.681006669998169
    i:11, pred:1.2105796337127686
    

    Convert back to the original scale. Note that the first 12 entries of test_inputs are the real values of the last training year; the final 12 are the model's predictions.

    # invert the normalization on the predictions; as before, the scaler needs a 2-D column
    actual_predictions = scaler.inverse_transform(np.array(test_inputs).reshape(-1, 1))
    actual_predictions.flatten().tolist()
    
    [360.0000016912818,
     342.0000002216548,
     406.0000023394823,
     396.00000293552876,
     420.0000015050173,
     472.00000518560415,
     548.0000006556511,
     559.0,
     463.00000572204596,
     407.00000227987766,
     362.0000015720725,
     405.00000239908695,
     434.98399974405766,
     460.6200530529023,
     532.2040940225124,
     509.2382396161557,
     466.37749040126806,
     446.47707664966583,
     499.88250562548643,
     567.1543129682541,
     621.1136566996574,
     617.6283966898918,
     713.9290174245834,
     606.9068666696548]
    

    2.8 Visualization

    x = np.arange(120, 144, 1)
    print(x)
    plt.title('Month vs Passenger')
    plt.ylabel('Total Passengers')
    plt.grid(True)
    plt.autoscale(axis='x', tight=True)
    plt.plot(flight_data['passengers'])
    plt.plot(x,actual_predictions)
    plt.show()
    
    [120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137
     138 139 140 141 142 143]
    

    As the plot shows, the model does predict part of the data, though clearly not with high accuracy; still, the prediction turns back down toward the end, so the basic shape of the waveform is captured.

    As these two examples show, there are two ways to train RNNs and LSTMs on time-series data (see the sketch after this list):

  • The first trains on the whole sequence at once: the model emits a prediction for every time step, and the loss compares that whole predicted sequence against the input sequence shifted forward by one step.
  • The second feeds in a window and compares only the final time step's prediction against the single next ground-truth value.

    The RNN example uses the first approach and the LSTM example the second; both work.
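
    A minimal side-by-side sketch of the two loss constructions (pseudocode-style, summarizing the code above):

    # 1) whole-sequence loss (the RNN example):
    out, h = model(x, h)                # out: [1, seq_len, 1], one output per time step
    loss = criterion(out, y)            # y is x shifted forward by one step

    # 2) last-point loss (the LSTM example):
    y_pred = model(seq)                 # the model returns only the final step's prediction
    loss = criterion(y_pred, label)     # label is the single next value
    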

    References:

    1. 机器学习 Pytorch实现案例 LSTM案例(航班人数预测)

    2. Time Series Prediction using LSTM with PyTorch in Python
