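ONNX Summary
1. Introduction
I use ONNX all the time at work, so I set aside some time to summarize what I know about it.
ONNX is short for Open Neural Network Exchange. Jia Yangqing is generally credited as its creator, and it was launched jointly in 2017 by Microsoft, Facebook, and several other companies. It defines a unified intermediate format for storing trained models, so that different training frameworks can interoperate through the same format. The figure below illustrates this goal:
Nearly all mainstream training and inference frameworks now support ONNX. As an intermediate format, it stores network models with protobuf, which has a number of drawbacks: a 2 GB maximum file size, files that are not human-readable, and a learning curve for building, installing, and using protobuf. It is not the optimal choice, but it is so widely adopted by now that it is unlikely to change.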
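The ONNX specification currently has two variants, which differ mainly in the supported types and the default operator set. The neural-network variant of ONNX uses only tensors as inputs and outputs, while ONNX-ML, which supports classical machine learning models, also recognizes sequences and maps and extends the ONNX operator set for non-neural-network algorithms. This article covers only the neural-network variant.
2. ONNX Model Structure
The ONNX model structure is defined in https://github.com/onnx/onnx/blob/main/onnx/onnx.proto . The most commonly used messages are: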
- ModelProto
- GraphProto
- NodeProto
- TensorProto
- ValueInfoProto
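Their relationships are roughly as shown in the figure below:
Because an ONNX model is stored as protobuf, it cannot be opened and read directly. The following methods can be used to inspect its contents:
1. Use the protoc command that ships with protobuf:
   protoc --decode=onnx.ModelProto onnx.proto < xxx.onnx
2. Use the visualization tool netron, which most readers probably already know.
3. In Python, simply print(model).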
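3. Python and C++ Interfaces
3.1 Python Interface
For the complete Python API, refer to the official ONNX source code.
This section uses the residual block from the classic ResNet network as an example to show how the Python interface is used. First, a brief introduction to ResNet and its residual block. ResNet is based on VGG19, modified and extended with residual units through a shortcut mechanism. The main changes are that ResNet downsamples directly with stride=2 convolutions and replaces the fully connected layers with a global average pooling layer.
Now for the residual unit. The figure below shows its basic form:
ResNet actually uses two kinds of residual units, shown in the figure below: the one on the left is used in the shallower networks and the one on the right in the deeper networks:
For the shortcut connection, when the input and output dimensions match, the input can be added directly to the output. This corresponds to the left part of the figure above, and resnet18 and resnet34 use this kind of residual block. When the dimensions do not match (the channel count doubles), the two cannot be added directly, and the shortcut has to be adjusted first, for example with a 1x1 conv projection. For the deeper networks the paper proposes the bottleneck residual block, which uses 1x1 convs to shrink and then restore the number of channels, so that the channel count of the 3x3 conv does not depend on the block's input and output width. resnet50, resnet101, and resnet152 mainly use this kind of residual block.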
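A quick parameter count shows why the 1x1 convs are worth the trouble. The snippet is plain arithmetic, using the channel sizes of the example in this article (128 outer channels, 64 inner channels). The comparison against two 3x3 convs at full width is my own illustration, not a figure taken from the paper.

```python
def conv_params(out_channels: int, in_channels: int, kernel: int) -> int:
    """Number of weights in a bias-free 2D convolution with a square kernel."""
    return out_channels * in_channels * kernel * kernel


# Bottleneck block: 1x1 reduces 128 -> 64, 3x3 works at 64, 1x1 restores 64 -> 128.
bottleneck = (
    conv_params(64, 128, 1)
    + conv_params(64, 64, 3)
    + conv_params(128, 64, 1)
)

# Basic block at the same outer width: two 3x3 convolutions at 128 channels.
basic = 2 * conv_params(128, 128, 3)

print(bottleneck, basic, round(basic / bottleneck, 2))  # 53248 294912 5.54
```

At the same outer width the bottleneck layout needs roughly 5.5 times fewer weights, which is what makes the deeper variants affordable.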
Below, the Python interface is used to build a bottleneck residual block. The main code is:
import numpy as np
import onnx
from onnx import helper
from onnx import TensorProto


def make_initializer_tensor(name, dims) -> TensorProto:
    # Random float32 weights, stored as raw bytes in a TensorProto.
    value = np.random.random(dims).astype(np.float32)
    tensor = helper.make_tensor(
        name=name,
        data_type=TensorProto.FLOAT,
        dims=list(value.shape),
        vals=value.tobytes(),
        raw=True)
    return tensor


conv1_input = helper.make_tensor_value_info(
    'conv1_input', TensorProto.FLOAT, [1, 128, 56, 56])

# ----------------- Convolution 1x1 -----------------
w1 = make_initializer_tensor("conv1_w", [64, 128, 1, 1])
conv1_output = helper.make_tensor_value_info(
    'conv1_output', TensorProto.FLOAT, [1, 64, 56, 56])
conv1 = helper.make_node(
    "Conv",
    inputs=["conv1_input", "conv1_w"],
    outputs=["conv1_output"],
    kernel_shape=[1, 1],
    strides=[1, 1],
    dilations=[1, 1],
    group=1,
    pads=[0, 0, 0, 0])
relu1_output = helper.make_tensor_value_info(
    'relu1_output', TensorProto.FLOAT, [1, 64, 56, 56])
relu1 = helper.make_node(
    "Relu", inputs=["conv1_output"], outputs=["relu1_output"])

# ----------------- Convolution 3x3 -----------------
w2 = make_initializer_tensor("conv2_w", [64, 64, 3, 3])
conv2_output = helper.make_tensor_value_info(
    'conv2_output', TensorProto.FLOAT, [1, 64, 56, 56])
conv2 = helper.make_node(
    "Conv",
    inputs=["relu1_output", "conv2_w"],
    outputs=["conv2_output"],
    kernel_shape=[3, 3],
    strides=[1, 1],
    dilations=[1, 1],
    group=1,
    pads=[1, 1, 1, 1])
relu2_output = helper.make_tensor_value_info(
    'relu2_output', TensorProto.FLOAT, [1, 64, 56, 56])
relu2 = helper.make_node(
    "Relu", inputs=["conv2_output"], outputs=["relu2_output"])

# ----------------- Convolution 1x1 -----------------
w3 = make_initializer_tensor("conv3_w", [128, 64, 1, 1])
conv3_output = helper.make_tensor_value_info(
    'conv3_output', TensorProto.FLOAT, [1, 128, 56, 56])
conv3 = helper.make_node(
    "Conv",
    inputs=["relu2_output", "conv3_w"],
    outputs=["conv3_output"],
    kernel_shape=[1, 1],
    strides=[1, 1],
    dilations=[1, 1],
    group=1,
    pads=[0, 0, 0, 0])

# Shortcut: add the block input to the output of the last convolution.
add_output = helper.make_tensor_value_info(
    'add_output', TensorProto.FLOAT, [1, 128, 56, 56])
add = helper.make_node(
    "Add", inputs=["conv3_output", "conv1_input"], outputs=["add_output"])

# graph and model
graph = helper.make_graph(
    nodes=[conv1, relu1, conv2, relu2, conv3, add],
    name="residual_block",
    inputs=[conv1_input],
    outputs=[add_output],
    initializer=[w1, w2, w3],
    value_info=[conv1_output, relu1_output, conv2_output,
                relu2_output, conv3_output, add_output])
model = helper.make_model(graph)

# save model
onnx.checker.check_model(model)
onnx.save(model, "bottleneck_residual_block.onnx")
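Visualized with netron, the model saved above looks as shown in the figure below:
To build a whole ResNet network, the code above can be wrapped into a function dedicated to creating residual blocks.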
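3.2 C++ Interface
ONNX does not provide a C++ interface comparable to the Python one, so users have to write it themselves. This code is more tedious to write because every detail must be created and filled in by hand, such as the contents of ModelProto, GraphProto, and NodeProto. Each part may involve many nested layers. For example, creating a graph input of type ValueInfoProto in GraphProto involves a TypeProto, which contains a Tensor, which in turn contains a TensorShapeProto. Below is an example that creates the residual block in C++: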
#include <google/protobuf/io/zero_copy_stream_impl.h>

#include <cstdint>
#include <fstream>
#include <memory>
#include <string>
#include <vector>

#include "misc_common.hpp"  // the author's own helper header (not shown)
#include "onnx.pb.h"

using std::make_unique;
using std::string;
using std::vector;

// Fill a ValueInfoProto: name -> TypeProto -> Tensor -> TensorShapeProto.
void setValueInfoProto(onnx::ValueInfoProto* value_info, const string& name,
                       const vector<int64_t>& shape, int64_t elem_type) {
  value_info->set_name(name);
  auto* tensor_type = value_info->mutable_type()->mutable_tensor_type();
  tensor_type->set_elem_type(elem_type);
  auto* tensor_shape = tensor_type->mutable_shape();
  for (auto dim : shape) {
    auto* dim_proto = tensor_shape->add_dim();
    dim_proto->set_dim_value(dim);
  }
}

// Fill a TensorProto with zero-initialized raw data (float elements assumed).
void setTensorProto(onnx::TensorProto* tensor, const string& name,
                    const vector<int64_t>& shape, int64_t elem_type) {
  tensor->set_name(name);
  tensor->set_data_type(elem_type);
  auto size = 1ul;
  for (auto dim : shape) {
    size *= dim;
    tensor->add_dims(dim);
  }
  string raw_data(size * sizeof(float), 0);
  tensor->set_raw_data(raw_data);
}

void setConvNode(const string& node_name, onnx::NodeProto* node,
                 const vector<string>& input_name, const string& output_name,
                 const vector<int64_t>& kernel_shape,
                 const vector<int64_t>& strides, const vector<int64_t>& pads) {
  node->set_name(node_name);
  node->set_op_type("Conv");
  for (const auto& name : input_name) {
    node->add_input(name);
  }
  node->add_output(output_name);

  auto* attr = node->add_attribute();
  attr->set_name("kernel_shape");
  attr->set_type(onnx::AttributeProto_AttributeType_INTS);
  for (auto dim : kernel_shape) {
    attr->add_ints(dim);
  }

  attr = node->add_attribute();
  attr->set_name("strides");
  attr->set_type(onnx::AttributeProto_AttributeType_INTS);
  for (auto dim : strides) {
    attr->add_ints(dim);
  }

  attr = node->add_attribute();
  attr->set_name("pads");
  attr->set_type(onnx::AttributeProto_AttributeType_INTS);
  for (auto dim : pads) {
    attr->add_ints(dim);
  }
}

void setReluNode(const string& node_name, onnx::NodeProto* node,
                 const string& input_name, const string& output_name) {
  node->set_name(node_name);
  node->set_op_type("Relu");
  node->add_input(input_name);
  node->add_output(output_name);
}

void setAddNode(const string& node_name, onnx::NodeProto* node,
                const string& input_name0, const string& input_name1,
                const string& output_name) {
  node->set_name(node_name);
  node->set_op_type("Add");
  node->add_input(input_name0);
  node->add_input(input_name1);
  node->add_output(output_name);
}

int main(int argc, char* argv[]) {
  auto model_proto = make_unique<onnx::ModelProto>();

  // model information
  model_proto->set_ir_version(6);
  model_proto->set_domain("");
  auto* opset = model_proto->add_opset_import();
  opset->set_domain("");
  opset->set_version(11);

  // graph information, including nodes, inputs, outputs, etc.
  auto* graph = model_proto->mutable_graph();
  graph->set_name("residual_block");
  auto* input = graph->add_input();
  auto* output = graph->add_output();
  setValueInfoProto(input, "input", {1, 128, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(output, "output", {1, 128, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);

  auto* conv1_output = graph->add_value_info();
  auto* relu1_output = graph->add_value_info();
  auto* conv2_output = graph->add_value_info();
  auto* relu2_output = graph->add_value_info();
  auto* conv3_output = graph->add_value_info();
  setValueInfoProto(conv1_output, "conv1_output", {1, 64, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(relu1_output, "relu1_output", {1, 64, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(conv2_output, "conv2_output", {1, 64, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(relu2_output, "relu2_output", {1, 64, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);
  setValueInfoProto(conv3_output, "conv3_output", {1, 128, 56, 56},
                    onnx::TensorProto_DataType_FLOAT);

  auto* conv1_weight = graph->add_initializer();
  auto* conv2_weight = graph->add_initializer();
  auto* conv3_weight = graph->add_initializer();
  setTensorProto(conv1_weight, "conv1_weight", {64, 128, 1, 1},
                 onnx::TensorProto_DataType_FLOAT);
  setTensorProto(conv2_weight, "conv2_weight", {64, 64, 3, 3},
                 onnx::TensorProto_DataType_FLOAT);
  setTensorProto(conv3_weight, "conv3_weight", {128, 64, 1, 1},
                 onnx::TensorProto_DataType_FLOAT);

  // node information, includes node name, op type, inputs, outputs, etc.
  auto* conv1_node = graph->add_node();
  auto* relu1_node = graph->add_node();
  auto* conv2_node = graph->add_node();
  auto* relu2_node = graph->add_node();
  auto* conv3_node = graph->add_node();
  auto* add_node = graph->add_node();
  setConvNode("conv1", conv1_node, vector<string>{"input", "conv1_weight"},
              "conv1_output", {1, 1}, {1, 1}, {0, 0, 0, 0});
  setReluNode("relu1", relu1_node, "conv1_output", "relu1_output");
  setConvNode("conv2", conv2_node,
              vector<string>{"relu1_output", "conv2_weight"}, "conv2_output",
              {3, 3}, {1, 1}, {1, 1, 1, 1});
  setReluNode("relu2", relu2_node, "conv2_output", "relu2_output");
  setConvNode("conv3", conv3_node,
              vector<string>{"relu2_output", "conv3_weight"}, "conv3_output",
              {1, 1}, {1, 1}, {0, 0, 0, 0});
  setAddNode("add0", add_node, "input", "conv3_output", "output");

  // save model to file
  std::ofstream ofs("test.onnx", std::ios::out | std::ios::binary);
  model_proto->SerializeToOstream(&ofs);
  return 0;
}
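Visualized with netron, the model generated by the C++ code looks as follows:
4. New Features
Much of this section is based on an ONNX talk given by 大老师 from the ONNX community. The video link is in the references at the end. The latest ONNX release at the time of writing is 1.13. Not all of the features below were introduced in 1.13: hub has been available since 1.11, and parser and others were already available in 1.12. Not every new feature is listed here either. Training support, for example, is not covered. Only the features I personally find most important are included.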
4.1 hub
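hub is a model repository maintained officially by ONNX. Here is a description I found online that seems fairly accurate:
The ONNX Model Hub is a simple and fast way to get started with state of the art pre-trained ONNX models from the ONNX Model Zoo. Furthermore, this allows researchers and model developers the opportunity to share their pre-trained models with the broader community. The ONNX Model hub is available after ONNX 1.11.0.
ONNX hub consists of a client and a server. The client code is at https://github.com/onnx/onnx/blob/main/onnx/hub.py , and it obtains the metadata of all models from the file https://github.com/onnx/models/blob/main/ONNX_HUB_MANIFEST.json . The figure below is a basic architecture diagram of ONNX hub that I found online. It is not particularly well drawn, but it is the only one I could find:
At the time of writing, I counted 171 models in total. The full model list can be obtained with the following code: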
from onnx import hub
models = hub.list_models()
To download resnet50, simply call hub.load("resnet50"). hub.get_model_info("resnet50") returns some information about the model, as shown below:
>>> hub.get_model_info("resnet50")
ModelInfo(model=ResNet50, opset=7, path=vision/classification/resnet/model/resnet50-v1-7.onnx, metadata={'model_sha': 'af16a04a6ec48ac494065d4439fe9dea590d337b9ca6dc328160ccf04a217b9c', 'model_bytes': 102583340, 'tags': ['vision', 'classification', 'resnet'], 'io_ports': {'inputs': [{'name': 'data', 'shape': ['N', 3, 224, 224], 'type': 'tensor(float)'}], 'outputs': [{'name': 'resnetv17_dense0_fwd', 'shape': ['N', 1000], 'type': 'tensor(float)'}]}, 'extra_ports': {'features': [{'name': 'resnetv17_pool1_fwd', 'shape': [0, 2048, 1, 1]}]}, 'model_with_data_path': 'vision/classification/resnet/model/resnet50-v1-7.tar.gz', 'model_with_data_sha': '898a9183256e884ae06fb1c11869386eccd38393ab41d9339909e974519a9feb', 'model_with_data_bytes': 95435189})
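To make the role of the manifest metadata more concrete, here is a rough offline imitation of what a hub client has to do: look a model up by name and check the downloaded bytes against the recorded size and SHA-256. It is not the code of onnx.hub. The functions find_model and verify_download, the one-entry manifest, and the stand-in payload are all hypothetical, with field names borrowed from the ModelInfo output above.

```python
import hashlib

# Stand-in for the bytes of a downloaded .onnx file (not a real model).
payload = b"stand-in bytes for a downloaded .onnx file"

# A hypothetical one-entry manifest shaped like the ModelInfo shown above.
manifest = [
    {
        "model": "ResNet50",
        "opset": 7,
        "path": "vision/classification/resnet/model/resnet50-v1-7.onnx",
        "metadata": {
            "tags": ["vision", "classification", "resnet"],
            "model_sha": hashlib.sha256(payload).hexdigest(),
            "model_bytes": len(payload),
        },
    },
]


def find_model(entries, name, opset=None):
    """Case-insensitive lookup by name, optionally pinned to an opset."""
    matches = [
        e for e in entries
        if e["model"].lower() == name.lower()
        and (opset is None or e["opset"] == opset)
    ]
    if not matches:
        raise KeyError(f"no model named {name!r} in manifest")
    # Prefer the entry with the highest opset when several match.
    return max(matches, key=lambda e: e["opset"])


def verify_download(entry, data):
    """Compare size and SHA-256 of the bytes against the manifest entry."""
    meta = entry["metadata"]
    return (len(data) == meta["model_bytes"]
            and hashlib.sha256(data).hexdigest() == meta["model_sha"])


entry = find_model(manifest, "resnet50")
ok = verify_download(entry, payload)
tampered_ok = verify_download(entry, payload + b"!")
by_tag = [e["model"] for e in manifest if "resnet" in e["metadata"]["tags"]]
print(entry["path"], ok, tampered_ok, by_tag)
```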
4.2 parser
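As Chapter 3 shows, creating an ONNX model is tedious with either the Python or the C++ interface. The ONNX parser alleviates this to some extent. Taking the residual block from 3.1 as the example again, the code can be simplified with the parser as follows: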
import onnx
from onnx import parser

m = parser.parse_model("""
<
  ir_version: 8,
  opset_import: ["" : 17]
>
residual_block (float[1,128,56,56] input)=>(float[1,128,56,56] add_output)
{
  conv1_w=Constant()
  conv1_output=Conv<kernel_shape=[1, 1],strides=[1, 1],pads=[0, 0, 0, 0]>(input, conv1_w)
  relu1_output=Relu(conv1_output)
  conv2_w=Constant()
  conv2_output=Conv<kernel_shape=[3, 3],strides=[1, 1],pads=[1, 1, 1, 1]>(relu1_output, conv2_w)
  relu2_output=Relu(conv2_output)
  conv3_w=Constant()
  conv3_output=Conv<kernel_shape=[1, 1],strides=[1, 1],pads=[0, 0, 0, 0]>(relu2_output, conv3_w)
  add_output=Add(input,conv3_output)
}
""")
print(m)
onnx.checker.check_model(m)
onnx.save(m, 'residual_bottleneck_block.onnx')
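The code is much simpler than in 3.1. The generated model, visualized with netron, is shown in the figure below:
This approach still feels incomplete, though. For the Conv op, the required weight input can only be produced through a Constant expression, and it is inconvenient to assign a value to the Constant inside the text. One idea was to produce the weight data outside the text, by reading a file or with np.random, and then reference that outside identifier from inside the text. I tried this without success, so it may not be supported at present.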
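The grammar of the parser text format is described in the official documentation.
It is not very intuitive and takes some study before it can be used. Hopefully it will be improved in the future.
4.3 function
I do not know much about this feature yet. Roughly, it allows a larger operator to be composed from existing smaller operators. Below is the definition of FunctionProto in onnx.proto, with some less important fields omitted: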
message FunctionProto {
  // The name of the function, similar usage of op_type in OperatorProto.
  // Combined with FunctionProto.domain, this forms the unique identity of
  // the FunctionProto.
  optional string name = 1;

  // The inputs and outputs of the function.
  repeated string input = 4;
  repeated string output = 5;

  // The attributes of the function.
  repeated string attribute = 6;

  // The nodes in the function.
  repeated NodeProto node = 7;

  // The domain which this function belongs to. Combined with FunctionProto.name, this forms the unique identity of
  // the FunctionProto.
  optional string domain = 10;
}
A function looks a bit like a subgraph: it has repeated nodes along with corresponding inputs and outputs. It is serialized into the model as a member of ModelProto:
message ModelProto {
  // A list of function protos local to the model.
  // Name of the function "FunctionProto.name" should be unique within the domain "FunctionProto.domain".
  // In case of any conflicts the behavior (whether the model local functions are given higher priority,
  // or standard opserator sets are given higher priotity or this is treated as error) is defined by
  // the runtimes.
  // The operator sets imported by FunctionProto should be compatible with the ones
  // imported by ModelProto and other model local FunctionProtos.
  // Example, if same operator set say 'A' is imported by a FunctionProto and ModelProto
  // or by 2 FunctionProtos then versions for the operator set may be different but,
  // the operator schema returned for op_type, domain, version combination
  // for both the versions should be same for every node in the function body.
  // One FunctionProto can reference other FunctionProto in the model, however, recursive reference
  // is not allowed.
  repeated FunctionProto functions = 25;
};
The official documentation, ONNX with function, has an example of creating a function and adding it to a model:
import numpy
from onnx import numpy_helper, TensorProto
from onnx.helper import (
    make_model, make_node, set_model_props, make_tensor,
    make_graph, make_tensor_value_info, make_opsetid,
    make_function)
from onnx.checker import check_model

new_domain = 'custom'
opset_imports = [make_opsetid("", 14), make_opsetid(new_domain, 1)]

# Let's define a function for a linear regression
node1 = make_node('MatMul', ['X', 'A'], ['XA'])
node2 = make_node('Add', ['XA', 'B'], ['Y'])

linear_regression = make_function(
    new_domain,            # domain name
    'LinearRegression',    # function name
    ['X', 'A', 'B'],       # input names
    ['Y'],                 # output names
    [node1, node2],        # nodes
    opset_imports,         # opsets
    [])                    # attribute names

# Let's use it in a graph.
X = make_tensor_value_info('X', TensorProto.FLOAT, [None, None])
A = make_tensor_value_info('A', TensorProto.FLOAT, [None, None])
B = make_tensor_value_info('B', TensorProto.FLOAT, [None, None])
Y = make_tensor_value_info('Y', TensorProto.FLOAT, [None])

graph = make_graph(
    [make_node('LinearRegression', ['X', 'A', 'B'], ['Y1'], domain=new_domain),
     make_node('Abs', ['Y1'], ['Y'])],
    'example',
    [X, A, B], [Y])

onnx_model = make_model(
    graph, opset_imports=opset_imports,
    functions=[linear_regression])  # functions to add
check_model(onnx_model)

# the work is done, let's display it...
print(onnx_model)
The ONNX new-features video by 大老师 also contains a related example, reproduced below for reference:
import onnx.parser
from onnx.onnx_pb import FunctionProto

input = """
    agraph (float[N, 128] X) => (float[N, 128] Y)
    {
        Y = Softmax(X)
    }
"""
graph = onnx.parser.parse_graph(input)
node = graph.node[0]
schema = onnx.defs.get_schema(node.op_type, node.domain)