I am using the OpenCV (4.6.0) DNN module to run semantic segmentation on images. The output of the network is a cv::Mat of size (numberOfClasses x image_height x image_width) that contains the class probabilities for every pixel.

I want to calculate the class ID that has the highest probability for every pixel.

In Python, the function numpy.argmax(src_matrix, axis=0) gives the desired output.

In C++ OpenCV, the function cv::reduceArgMax(src_, dst_, axis_) computes the same thing, but only for 2D matrices. I therefore tried to extract 2D slices (of size (image_width x numberOfClasses) or (numberOfClasses x image_height)) from the 3D matrix and compute the argmax of each slice in a loop. However, I could not get the correct slices.

Example Code

int sizes[] = {numberOfClasses, imageHeight, imageWidth};
cv::Mat probabilityMatrix(3, sizes, CV_32F);
cv::Mat argMaxOfSlice; // allocated by cv::reduceArgMax
for (int r = 0; r < probabilityMatrix.size[1]; r++) {
    // cv::Mat slice = get a 2D slice of size (imageWidth x numberOfClasses) for row r
    // cv::reduceArgMax(slice, argMaxOfSlice, 1);
}

I would prefer to use only OpenCV, but I can also use Eigen (3.2.10).

EDIT:

Python Example Code along with example input:

import numpy as np
# Shape of the example_input (3x3x4) where (ch, row, col)
example_input = np.array([[[ -1,  0,  -1,  -1],
                           [ 0,  -1,  -1,  0],
                           [ 0,  -1,  -1,  -1]],
                          [[ -1,  -1,  1,  1],
                           [ -1,  -1,  -1,  -1],
                           [ 1,  -1,  1,  -1]],
                          [[ 2,  -1,  -1,  -1],
                           [ -1,  2,  2,  -1],
                           [ -1,  2,  -1,  2]]])
expected_output = np.array([[ 2,  0,  1,  1],
                            [ 0,  2,  2,  0],
                            [ 1,  2,  1,  2]])
function_output = np.argmax(example_input, axis=0)
if np.count_nonzero(expected_output - function_output) > 0 : 
    print("Something wrong")
else:
    print("Correct")

C++ OpenCV Example Input and Expected Output

int example_size[3] = {3, 3, 4};
// Same values as the Python example, laid out as (ch, row, col)
float example_input_data[36] = {-1,  0, -1, -1,   0, -1, -1,  0,   0, -1, -1, -1,
                                -1, -1,  1,  1,  -1, -1, -1, -1,   1, -1,  1, -1,
                                 2, -1, -1, -1,  -1,  2,  2, -1,  -1,  2, -1,  2};
cv::Mat example_input(3, example_size, CV_32F, example_input_data);
// The expected output is stored as int, so the Mat type must be CV_32S
int expected_output_data[12] = {2, 0, 1, 1,  0, 2, 2, 0,  1, 2, 1, 2};
cv::Mat expected_output(3, 4, CV_32S, expected_output_data);

Thank you

Reshape into a 2D Mat, with numberOfClasses rows and imageHeight * imageWidth columns. Now each column stores the probabilities for a single pixel. Next, a single call to reduceArgMax reduces it into a single row. Finally, reshape into a 2D Mat with imageHeight rows and imageWidth columns. – Dan Mašek Nov 28, 2022 at 22:30

If you provide a proper minimal reproducible example, with sample input (something like a 4x4x4 matrix of probabilities will do) and expected output (or just provide a Python sample with the same input), I'll write up a proper answer with a working code example ;) | NB: It's handy to be aware of the in-memory layout of the data you're working with (as well as that of cv::Mat). Often you can "massage" the data a little bit like I've done above and use functions that wouldn't otherwise work. – Dan Mašek Nov 28, 2022 at 22:43
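To make the memory-layout point concrete, here is a minimal sketch in plain C++ (no OpenCV), assuming the scores sit in one contiguous row-major buffer of shape C x H x W. The names flatIndex and argmaxPerPixel are hypothetical helpers made up for this illustration, not OpenCV functions. Element (ch, row, col) of the 3D view and element (ch, row * W + col) of the C x (H*W) view share the same offset, which is why the reshape needs no copy and a column-wise argmax yields the per-pixel class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): flat offset of element
// (ch, row, col) in a contiguous row-major buffer of shape C x H x W.
inline std::size_t flatIndex(int ch, int row, int col, int H, int W) {
    return (static_cast<std::size_t>(ch) * H + row) * W + col;
}

// Treat the same buffer as a C x (H*W) matrix and take the argmax of each
// column, i.e. over the class axis. Ties resolve to the lowest class index.
std::vector<int> argmaxPerPixel(const std::vector<float>& data, int C, int H, int W) {
    const std::size_t pixels = static_cast<std::size_t>(H) * W;
    assert(C > 0 && data.size() == pixels * static_cast<std::size_t>(C));
    std::vector<int> classId(pixels, 0);
    for (std::size_t p = 0; p < pixels; ++p) {
        float best = data[p];  // score of class 0 for pixel p
        for (int ch = 1; ch < C; ++ch) {
            const float score = data[static_cast<std::size_t>(ch) * pixels + p];
            if (score > best) {
                best = score;
                classId[p] = ch;
            }
        }
    }
    return classId;
}
```

The returned vector has H * W entries in row-major order, so entry row * W + col is the class of pixel (row, col); this corresponds to the final reshape back to imageHeight rows.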

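Thanks to @DanMašek, I implemented it as follows:

// View the network output as a (numberOfClasses x rows*cols) single-channel matrix
cv::Mat reshaped = network_out.reshape(1, numberOfClasses);
cv::Mat argmax_row_matrix;
// Reduce along axis 0: one class index per column (pixel)
cv::reduceArgMax(reshaped, argmax_row_matrix, 0);
// Reshape the resulting single row back to the image shape (rows x cols)
cv::Mat argmax_image_shape = argmax_row_matrix.reshape(1, rows);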

However, this implementation runs slower than the following:

cv::Mat classID = cv::Mat::zeros(rows, cols, CV_32S);
// maxVal shares memory with channel 0 of network_out (no copy),
// so the running maximum overwrites channel 0 in place
cv::Mat maxVal(rows, cols, CV_32F, network_out.data);
for (int ch = 0; ch < chns; ch++){
    for (int row = 0; row < rows; row++){
        // network_out is indexed as (batch, channel, row)
        const float *ptrScore = network_out.ptr<float>(0, ch, row);
        int *ptrMaxCl = classID.ptr<int>(row);
        float *ptrMaxVal = maxVal.ptr<float>(row);
        for (int col = 0; col < cols; col++){
            if (ptrScore[col] > ptrMaxVal[col]){
                ptrMaxVal[col] = ptrScore[col];
                ptrMaxCl[col] = ch;
            }
        }
    }
}
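For comparison, here is a rough sketch of the same channel-by-channel running maximum in plain C++ (no OpenCV), assuming a contiguous C x H x W float buffer. The name argmaxChannelMajor is a hypothetical helper for illustration only. Unlike the loop above, it keeps its own copy of the running maximum, so the input scores are left untouched. A plausible reason this loop order does well is that each channel plane is read sequentially, but that is an assumption and is not measured here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: running maximum over channels of a contiguous
// C x H x W buffer. Returns H*W class indices in row-major order.
std::vector<int> argmaxChannelMajor(const std::vector<float>& scores, int C, int H, int W) {
    const std::size_t pixels = static_cast<std::size_t>(H) * W;
    assert(C > 0 && scores.size() == pixels * static_cast<std::size_t>(C));
    // Start from channel 0: copy its plane as the initial running maximum.
    std::vector<float> maxVal(scores.begin(),
                              scores.begin() + static_cast<std::ptrdiff_t>(pixels));
    std::vector<int> classId(pixels, 0);
    for (int ch = 1; ch < C; ++ch) {
        // Each channel plane is one contiguous run of H*W floats.
        const float* plane = scores.data() + static_cast<std::size_t>(ch) * pixels;
        for (std::size_t p = 0; p < pixels; ++p) {
            if (plane[p] > maxVal[p]) {  // strict '>' keeps the first maximum on ties
                maxVal[p] = plane[p];
                classId[p] = ch;
            }
        }
    }
    return classId;
}
```

With the 3x3x4 example input from the question, this should reproduce expected_output; on ties the strict comparison keeps the lowest class index, matching numpy.argmax.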
