ONNX Summary

1. Introduction

I use ONNX constantly at work, so I took some time to write up what I have learned about it.

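ONNX is short for Open Neural Network Exchange. It was initiated by Jia Yangqing (贾扬清) and launched jointly in 2017 by Microsoft, Facebook, and several other companies. It defines a unified intermediate format for storing trained models, so that different training frameworks can interoperate through the same format. The figure below illustrates this goal: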

Figure 1

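Most mainstream training and inference frameworks now support ONNX. As an intermediate format, it stores the network model with protobuf, which has several drawbacks: a maximum file size of 2GB, files that are not human-readable, and a learning curve for building, installing, and using protobuf. It is not the ideal choice, but it is already so widely adopted that it is unlikely to change.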

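The ONNX specification currently has two variants, which differ mainly in the supported types and the default operator set. The neural-network variant of ONNX uses only tensors as inputs and outputs. ONNX-ML, which supports traditional machine learning models, also recognizes sequences and maps, and extends the ONNX operator set for non-neural-network algorithms. This article covers only the neural-network variant.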

2. ONNX model structure

The ONNX model structure is defined in github.com/onnx/onnx/bl ; the most commonly used structures are:

  • ModelProto
  • GraphProto
  • NodeProto
  • TensorProto
  • ValueInfoProto

Their relationships are roughly as shown in the figure below:

Figure 2

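Because an ONNX model is stored with protobuf, it cannot be opened and read directly. The model contents can be inspected in the following ways:

  1. Use the protoc command that ships with protobuf:
     protoc --decode=onnx.ModelProto onnx.proto < xxx.onnx

  2. Use the visualization tool netron, which most readers probably know already.

  3. In Python, load the model and call print(model) directly.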

3. Python and C++ APIs

3.1 Python API

All of the Python APIs can be found in the official ONNX source code.

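This section uses the residual block from the classic ResNet as an example of how to use the Python API. First, a brief introduction to ResNet and its residual block. ResNet is modeled on VGG19 with modifications, and adds residual units through a shortcut mechanism. The main changes are that ResNet downsamples directly with stride=2 convolutions and replaces the fully connected layers with a global average pool layer.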

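Now for the residual unit itself. The figure below shows its basic form:

Figure 3

ResNet actually uses two kinds of residual units, shown below. The left one is used in the shallower networks and the right one in the deeper networks:

Figure 4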

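For the shortcut connection, when the input and output dimensions match, the input can be added directly to the output. This corresponds to the left side of the figure above, and resnet18 and resnet34 use this kind of residual block. When the dimensions do not match (which corresponds to the channel count doubling), the two cannot be added directly and the shortcut itself has to be adjusted, for example with a 1x1 conv projection. For the deeper networks the paper proposes the BottleNeck residual block, which uses 1x1 convs to reduce and then restore the number of output channels, so that the 3x3 conv works on a channel count that does not depend on the block's input and output width. resnet50, resnet101, and resnet152 mainly use this kind of block.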
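To make the effect of the 1x1 convs concrete, the weight counts can be worked out by hand. The sketch below uses the channel sizes of the block built in this section (128 in and out, 64 in the middle). The comparison against two plain 3x3 convs at 128 channels is my own illustration, not a figure from the paper.

```python
def conv_params(in_ch: int, out_ch: int, k: int) -> int:
    """Number of weights in a k x k convolution without bias."""
    return in_ch * out_ch * k * k


# Bottleneck: 1x1 (128 -> 64), 3x3 (64 -> 64), 1x1 (64 -> 128)
bottleneck = (conv_params(128, 64, 1)
              + conv_params(64, 64, 3)
              + conv_params(64, 128, 1))

# Two plain 3x3 convolutions that keep all 128 channels
basic = 2 * conv_params(128, 128, 3)

print(bottleneck, basic)  # 53248 294912
```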

The following uses the Python API to create a BottleNeck residual block. The main code is:

import numpy as np
import onnx
from onnx import helper
from onnx import TensorProto


def make_initializer_tensor(name, dims) -> TensorProto:
    # Random float32 weights, stored as raw bytes in the TensorProto.
    value = np.random.random(dims).astype(np.float32)
    tensor = helper.make_tensor(
        name=name,
        data_type=TensorProto.FLOAT,
        dims=list(value.shape),
        vals=value.tobytes(),
        raw=True)
    return tensor


graph_input = helper.make_tensor_value_info(
    'conv1_input', TensorProto.FLOAT, [1, 128, 56, 56])

# ----------------- Convolution 1x1 -----------------
w1 = make_initializer_tensor("conv1_w", [64, 128, 1, 1])
conv1_output = helper.make_tensor_value_info(
    'conv1_output', TensorProto.FLOAT, [1, 64, 56, 56])
conv1 = helper.make_node(
    "Conv",
    inputs=["conv1_input", "conv1_w"],
    outputs=["conv1_output"],
    kernel_shape=[1, 1],
    strides=[1, 1],
    dilations=[1, 1],
    group=1,
    pads=[0, 0, 0, 0])
relu1_output = helper.make_tensor_value_info(
    'relu1_output', TensorProto.FLOAT, [1, 64, 56, 56])
relu1 = helper.make_node(
    "Relu", inputs=["conv1_output"], outputs=["relu1_output"])

# ----------------- Convolution 3x3 -----------------
w2 = make_initializer_tensor("conv2_w", [64, 64, 3, 3])
conv2_output = helper.make_tensor_value_info(
    'conv2_output', TensorProto.FLOAT, [1, 64, 56, 56])
conv2 = helper.make_node(
    "Conv",
    inputs=["relu1_output", "conv2_w"],
    outputs=["conv2_output"],
    kernel_shape=[3, 3],
    strides=[1, 1],
    dilations=[1, 1],
    group=1,
    pads=[1, 1, 1, 1])
relu2_output = helper.make_tensor_value_info(
    'relu2_output', TensorProto.FLOAT, [1, 64, 56, 56])
relu2 = helper.make_node(
    "Relu", inputs=["conv2_output"], outputs=["relu2_output"])

# ----------------- Convolution 1x1 -----------------
w3 = make_initializer_tensor("conv3_w", [128, 64, 1, 1])
conv3_output = helper.make_tensor_value_info(
    'conv3_output', TensorProto.FLOAT, [1, 128, 56, 56])
conv3 = helper.make_node(
    "Conv",
    inputs=["relu2_output", "conv3_w"],
    outputs=["conv3_output"],
    kernel_shape=[1, 1],
    strides=[1, 1],
    dilations=[1, 1],
    group=1,
    pads=[0, 0, 0, 0])

# ----------------- Shortcut: add the block input -----------------
add_output = helper.make_tensor_value_info(
    'add_output', TensorProto.FLOAT, [1, 128, 56, 56])
add = helper.make_node(
    "Add", inputs=["conv3_output", "conv1_input"], outputs=["add_output"])

# graph and model
graph = helper.make_graph(
    nodes=[conv1, relu1, conv2, relu2, conv3, add],
    name="residual_block",
    inputs=[graph_input],
    outputs=[add_output],
    initializer=[w1, w2, w3],
    value_info=[conv1_output, relu1_output, conv2_output,
                relu2_output, conv3_output, add_output])
model = helper.make_model(graph)

# check and save model
onnx.checker.check_model(model)
onnx.save(model, "bottleneck_residual_block.onnx")

Visualized in netron, the saved model looks like this:

Figure 5

To build a whole ResNet, the code above can be wrapped in a function dedicated to creating residual blocks.

3.2 C++ API

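ONNX does not provide a C++ helper API like the Python one, so users have to write it themselves. This code is more tedious to write because every detail has to be filled in by hand, such as the contents of ModelProto, GraphProto, and NodeProto. Each part can involve many nested levels. For example, to create an input of type ValueInfoProto in a GraphProto, the ValueInfoProto contains a TypeProto, the TypeProto contains a Tensor, and the Tensor contains a TensorShapeProto. Below is an example that creates the residual block in C++: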

#include <google/protobuf/io/zero_copy_stream_impl.h>

#include <cstddef>
#include <cstdint>
#include <fstream>
#include <memory>
#include <string>
#include <vector>

#include "onnx.pb.h"

using std::make_unique;
using std::string;
using std::vector;

// Fill a ValueInfoProto: name -> TypeProto -> Tensor -> TensorShapeProto.
void setValueInfoProto(onnx::ValueInfoProto* value_info, const string& name,
                       const vector<int64_t>& shape, int32_t elem_type) {
  value_info->set_name(name);
  auto* tensor_type = value_info->mutable_type()->mutable_tensor_type();
  tensor_type->set_elem_type(elem_type);
  auto* tensor_shape = tensor_type->mutable_shape();
  for (auto dim : shape) {
    auto* dim_proto = tensor_shape->add_dim();
    dim_proto->set_dim_value(dim);
  }
}

// Fill an initializer with zero-valued float data of the given shape.
void setTensorProto(onnx::TensorProto* tensor, const string& name,
                    const vector<int64_t>& shape, int32_t elem_type) {
  tensor->set_name(name);
  tensor->set_data_type(elem_type);
  size_t size = 1;
  for (auto dim : shape) {
    size *= dim;
    tensor->add_dims(dim);
  }
  string raw_data(size * sizeof(float), '\0');
  tensor->set_raw_data(raw_data);
}

// Append an INTS attribute to a node.
void addIntsAttribute(onnx::NodeProto* node, const string& name,
                      const vector<int64_t>& values) {
  auto* attr = node->add_attribute();
  attr->set_name(name);
  attr->set_type(onnx::AttributeProto_AttributeType_INTS);
  for (auto value : values) {
    attr->add_ints(value);
  }
}

void setConvNode(const string& node_name, onnx::NodeProto* node,
                 const vector<string>& input_name, const string& output_name,
                 const vector<int64_t>& kernel_shape,
                 const vector<int64_t>& strides, const vector<int64_t>& pads) {
  node->set_name(node_name);
  node->set_op_type("Conv");
  for (const auto& name : input_name) {
    node->add_input(name);
  }
  node->add_output(output_name);
  addIntsAttribute(node, "kernel_shape", kernel_shape);
  addIntsAttribute(node, "strides", strides);
  addIntsAttribute(node, "pads", pads);
}

void setReluNode(const string& node_name, onnx::NodeProto* node,
                 const string& input_name, const string& output_name) {
  node->set_name(node_name);
  node->set_op_type("Relu");
  node->add_input(input_name);
  node->add_output(output_name);
}

void setAddNode(const string& node_name, onnx::NodeProto* node,
                const string& input_name0, const string& input_name1,
                const string& output_name) {
  node->set_name(node_name);
  node->set_op_type("Add");
  node->add_input(input_name0);
  node->add_input(input_name1);
  node->add_output(output_name);
}

int main(int argc, char* argv[]) {
  auto model_proto = make_unique<onnx::ModelProto>();

  // model information
  model_proto->set_ir_version(6);
  model_proto->set_domain("");
  auto* opset = model_proto->add_opset_import();
  opset->set_domain("");
  opset->set_version(11);

  // graph information, including nodes, inputs, outputs, etc.
  auto* graph = model_proto->mutable_graph();
  graph->set_name("residual_block");
  auto* input = graph->add_input();
  auto* output = graph->add_output();
  setValueInfoProto(input, "input", {1, 128, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(output, "output", {1, 128, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);

  // shapes of the intermediate tensors
  auto* conv1_output = graph->add_value_info();
  auto* relu1_output = graph->add_value_info();
  auto* conv2_output = graph->add_value_info();
  auto* relu2_output = graph->add_value_info();
  auto* conv3_output = graph->add_value_info();
  setValueInfoProto(conv1_output, "conv1_output", {1, 64, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(relu1_output, "relu1_output", {1, 64, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(conv2_output, "conv2_output", {1, 64, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(relu2_output, "relu2_output", {1, 64, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(conv3_output, "conv3_output", {1, 128, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);

  // weights
  auto* conv1_weight = graph->add_initializer();
  auto* conv2_weight = graph->add_initializer();
  auto* conv3_weight = graph->add_initializer();
  setTensorProto(conv1_weight, "conv1_weight", {64, 128, 1, 1},
                 onnx::TensorProto_DataType_FLOAT);
  setTensorProto(conv2_weight, "conv2_weight", {64, 64, 3, 3},
                 onnx::TensorProto_DataType_FLOAT);
  setTensorProto(conv3_weight, "conv3_weight", {128, 64, 1, 1},
                 onnx::TensorProto_DataType_FLOAT);

  // node information, includes node name, op type, inputs, outputs, etc.
  auto* conv1_node = graph->add_node();
  auto* relu1_node = graph->add_node();
  auto* conv2_node = graph->add_node();
  auto* relu2_node = graph->add_node();
  auto* conv3_node = graph->add_node();
  auto* add_node = graph->add_node();
  setConvNode("conv1", conv1_node, vector<string>{"input", "conv1_weight"},
              "conv1_output", {1, 1}, {1, 1}, {0, 0, 0, 0});
  setReluNode("relu1", relu1_node, "conv1_output", "relu1_output");
  setConvNode("conv2", conv2_node,
              vector<string>{"relu1_output", "conv2_weight"}, "conv2_output",
              {3, 3}, {1, 1}, {1, 1, 1, 1});
  setReluNode("relu2", relu2_node, "conv2_output", "relu2_output");
  setConvNode("conv3", conv3_node,
              vector<string>{"relu2_output", "conv3_weight"}, "conv3_output",
              {1, 1}, {1, 1}, {0, 0, 0, 0});
  setAddNode("add0", add_node, "input", "conv3_output", "output");

  // save model to file
  std::ofstream ofs("test.onnx", std::ios::out | std::ios::binary);
  model_proto->SerializeToOstream(&ofs);
  return 0;
}

Visualized in netron, the generated model looks like this:

Figure 6

4. New features

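Much of this section comes from a talk on ONNX given by 大老师 from the ONNX community; the video link is listed in the references at the end. At the time of writing, the latest ONNX release is 1.13. Not all of the features below first appeared in 1.13: hub has been available since 1.11, and parser and others since 1.12. This is also not a complete list; training support, for example, is not covered. Only the features I find most important are listed here.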

4.1 hub

hub is a model repository maintained officially by ONNX. Here is a description I found online that seems accurate:

The ONNX Model Hub is a simple and fast way to get started with state of the art pre-trained ONNX models from the ONNX Model Zoo. Furthermore, this allows researchers and model developers the opportunity to share their pre-trained models with the broader community. The ONNX Model hub is available after ONNX 1.11.0.

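The ONNX hub has a client side and a server side. The client code is at github.com/onnx/onnx/bl , and it obtains the metadata of all models through the file at github.com/onnx/models/ . The figure below is an architecture diagram of the ONNX hub found online; it is not a great drawing, but it is the only one I could find:

Figure 7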

At the time of writing there are 171 models in total. The full model list can be obtained with the following code:

from onnx import hub
models = hub.list_models()

To download resnet50, simply call hub.load("resnet50"). hub.get_model_info("resnet50") returns some information about the model, as shown below:

>>> hub.get_model_info("resnet50")
ModelInfo(model=ResNet50, opset=7, path=vision/classification/resnet/model/resnet50-v1-7.onnx, metadata={'model_sha': 'af16a04a6ec48ac494065d4439fe9dea590d337b9ca6dc328160ccf04a217b9c', 'model_bytes': 102583340, 'tags': ['vision', 'classification', 'resnet'], 'io_ports': {'inputs': [{'name': 'data', 'shape': ['N', 3, 224, 224], 'type': 'tensor(float)'}], 'outputs': [{'name': 'resnetv17_dense0_fwd', 'shape': ['N', 1000], 'type': 'tensor(float)'}]}, 'extra_ports': {'features': [{'name': 'resnetv17_pool1_fwd', 'shape': [0, 2048, 1, 1]}]}, 'model_with_data_path': 'vision/classification/resnet/model/resnet50-v1-7.tar.gz', 'model_with_data_sha': '898a9183256e884ae06fb1c11869386eccd38393ab41d9339909e974519a9feb', 'model_with_data_bytes': 95435189})

4.2 parser

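As Section 3 shows, creating an ONNX model is tedious with either the Python or the C++ API. The ONNX parser alleviates this to some extent. Taking the residual block from 3.1 again, the code with the parser can be reduced to the following: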

import onnx
from onnx import parser

# The model header goes inside < >, the graph body inside { }.
m = parser.parse_model("""
    <
        ir_version: 8,
        opset_import: ["" : 17]
    >
    residual_block (float[1,128,56,56] input) => (float[1,128,56,56] add_output)
    {
        conv1_w = Constant()
        conv1_output = Conv<kernel_shape=[1, 1], strides=[1, 1], pads=[0, 0, 0, 0]>(input, conv1_w)
        relu1_output = Relu(conv1_output)
        conv2_w = Constant()
        conv2_output = Conv<kernel_shape=[3, 3], strides=[1, 1], pads=[1, 1, 1, 1]>(relu1_output, conv2_w)
        relu2_output = Relu(conv2_output)
        conv3_w = Constant()
        conv3_output = Conv<kernel_shape=[1, 1], strides=[1, 1], pads=[0, 0, 0, 0]>(relu2_output, conv3_w)
        add_output = Add(input, conv3_output)
    }
""")
print(m)
onnx.checker.check_model(m)
onnx.save(m, 'residual_bottleneck_block.onnx')

The code is much simpler than in 3.1. Visualized in netron, the generated model looks like this:

Figure 8

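This approach still feels incomplete, though. For the Conv op, as far as I could tell, the weight input can only be produced by a Constant expression, and it is inconvenient to assign values to the Constant inside the text block. One idea was to produce the weight data outside the text block, by reading a file or using np.random, and then reference that identifier from inside the text block. I tried this without success, so it may not be supported at present.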

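The syntax rules for the parser text block are described in the official documentation.

It is not very intuitive and takes some study to use. Hopefully it will be improved in the future.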

4.3 function

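I do not know this feature well yet. Roughly, it lets a larger operator be composed from existing smaller ones. Below is the definition of FunctionProto in onnx.proto, with some less important fields omitted: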

message FunctionProto {
  // The name of the function, similar usage of op_type in OperatorProto.
  // Combined with FunctionProto.domain, this forms the unique identity of
  // the FunctionProto.
  optional string name = 1;
  // The inputs and outputs of the function.
  repeated string input = 4;
  repeated string output = 5;
  // The attributes of the function.
  repeated string attribute = 6;
  // The nodes in the function.
  repeated NodeProto node = 7;
  // The domain which this function belongs to. Combined with FunctionProto.name, this forms the unique identity of
  // the FunctionProto.
  optional string domain = 10;
}

A function looks somewhat like a subgraph: it has repeated nodes and corresponding inputs and outputs. It is serialized into the model as a member of ModelProto:

message ModelProto {
  // A list of function protos local to the model.
  // Name of the function "FunctionProto.name" should be unique within the domain "FunctionProto.domain".
  // In case of any conflicts the behavior (whether the model local functions are given higher priority,
  // or standard opserator sets are given higher priotity or this is treated as error) is defined by
  // the runtimes.
  // The operator sets imported by FunctionProto should be compatible with the ones
  // imported by ModelProto and other model local FunctionProtos.
  // Example, if same operator set say 'A' is imported by a FunctionProto and ModelProto
  // or by 2 FunctionProtos then versions for the operator set may be different but,
  // the operator schema returned for op_type, domain, version combination
  // for both the versions should be same for every node in the function body.
  // One FunctionProto can reference other FunctionProto in the model, however, recursive reference
  // is not allowed.
  repeated FunctionProto functions = 25;
};

The official documentation, ONNX with function, has an example that creates a function and adds it to a model:

import numpy
from onnx import numpy_helper, TensorProto
from onnx.helper import (
    make_model, make_node, set_model_props, make_tensor,
    make_graph, make_tensor_value_info, make_opsetid,
    make_function)
from onnx.checker import check_model
new_domain = 'custom'
opset_imports = [make_opsetid("", 14), make_opsetid(new_domain, 1)]
# Let's define a function for a linear regression
node1 = make_node('MatMul', ['X', 'A'], ['XA'])
node2 = make_node('Add', ['XA', 'B'], ['Y'])
linear_regression = make_function(
    new_domain,            # domain name
    'LinearRegression',     # function name
    ['X', 'A', 'B'],        # input names
    ['Y'],                  # output names
    [node1, node2],         # nodes
    opset_imports,          # opsets
    [])                     # attribute names
# Let's use it in a graph.
X = make_tensor_value_info('X', TensorProto.FLOAT, [None, None])
A = make_tensor_value_info('A', TensorProto.FLOAT, [None, None])
B = make_tensor_value_info('B', TensorProto.FLOAT, [None, None])
Y = make_tensor_value_info('Y', TensorProto.FLOAT, [None])
graph = make_graph(
    [make_node('LinearRegression', ['X', 'A', 'B'], ['Y1'], domain=new_domain),
     make_node('Abs', ['Y1'], ['Y'])],
    'example',
    [X, A, B], [Y])
onnx_model = make_model(
    graph, opset_imports=opset_imports,
    functions=[linear_regression])  # functions to add
check_model(onnx_model)
# the work is done, let's display it...
print(onnx_model)

大老师's video on new ONNX features also contains a related example, excerpted below for reference:

import onnx.parser
from onnx.onnx_pb import FunctionProto

# The graph body goes inside { }, and the text block must be closed.
input = """
    agraph (float[N, 128] X) => (float[N, 128] Y)
    {
        Y = Softmax(X)
    }
"""
graph = onnx.parser.parse_graph(input)
node = graph.node[0]
schema = onnx.defs.get_schema(node.op_type, node.domain)