I am testing a pretrained ResNet-34 model using PyTorch and fastai on a Linux system with the latest Anaconda3. To run it as a batch job, I commented out the GUI-related lines. It ran for a few hours, then stopped at the validation step with the error message below.
100%|█████████▉| 452/453 [1:07:07<00:08, 8.75s/it, loss=1.23]
Validation:   0%|          | 0/40 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "./resnet34_pretrained_PNG_nogui_2.py", line 279, in <module>
    learner.fit(lr,1,callbacks=[f1_callback])
  File "/project/6000192/jemmyhu/resnet_png/fastai/learner.py", line 302, in fit
    return self.fit_gen(self.model, self.data, layer_opt, n_cycle, **kwargs)
  File "/project/6000192/jemmyhu/resnet_png/fastai/learner.py", line 249, in fit_gen
    swa_eval_freq=swa_eval_freq, **kwargs)
  File "/project/6000192/jemmyhu/resnet_png/fastai/model.py", line 162, in
    vals = validate(model_stepper, cur_data.val_dl, metrics, epoch, seq_first=seq_first, validate_skip = validate_skip)
  File "/project/6000192/jemmyhu/resnet_png/fastai/model.py", line 241, in validate
    res.append([to_np(f(datafy(preds), datafy(y))) for f in metrics])
  File "/project/6000192/jemmyhu/resnet_png/fastai/model.py", line 241, in <listcomp>
    res.append([to_np(f(datafy(preds), datafy(y))) for f in metrics])
  File "./resnet34_pretrained_PNG_nogui_2.py", line 237, in __call__
    self.TP += (preds*targs).float().sum(dim=0)
TypeError: add(): argument 'other' (position 1) must be Tensor, not numpy.ndarray
The link to the original code is
https://www.kaggle.com/iafoss/pretrained-resnet34-with-rgby-0-460-public-lb
Lines 279 and 237 in my copy are shown below:
226 class F1:
227     __name__ = 'F1 macro'
228     def __init__(self,n=28):
229         self.n = n
230         self.TP = np.zeros(self.n)
231         self.FP = np.zeros(self.n)
232         self.FN = np.zeros(self.n)
234     def __call__(self,preds,targs,th=0.0):
235         preds = (preds > th).int()
236         targs = targs.int()
237         self.TP += (preds*targs).float().sum(dim=0)
238         self.FP += (preds > targs).float().sum(dim=0)
239         self.FN += (preds < targs).float().sum(dim=0)
240         score = (2.0*self.TP/(2.0*self.TP + self.FP + self.FN + 1e-6)).mean()
241         return score
276 lr = 0.5e-2
277 with warnings.catch_warnings():
278     warnings.simplefilter("ignore")
279     learner.fit(lr,1,callbacks=[f1_callback])
Does anyone have a clue about this issue?
Many thanks,
Jemmy
I have had the same issue with this Kaggle kernel. My workarounds are the following:
1st option: in the F1 __call__ method, convert preds and targs from PyTorch tensors to NumPy arrays;
2nd option: initialise TP/FP/FN with PyTorch tensors instead of NumPy arrays, i.e. replace np.zeros(self.n) with torch.zeros(1, self.n).
Basically, the main idea is that all the variables should be of the same type; a sketch of the 2nd option is shown below.
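Here is a minimal sketch of the 2nd option, assuming preds and targs arrive in __call__ as CPU PyTorch tensors of shape (batch, n), which is what the traceback suggests; everything else is copied from the F1 class in the question, with only the accumulators changed:

import torch

class F1:
    __name__ = 'F1 macro'
    def __init__(self, n=28):
        self.n = n
        # Accumulators are now torch tensors (previously np.zeros(self.n)),
        # so the in-place "+=" below never mixes a tensor with an ndarray.
        self.TP = torch.zeros(1, self.n)
        self.FP = torch.zeros(1, self.n)
        self.FN = torch.zeros(1, self.n)
    def __call__(self, preds, targs, th=0.0):
        preds = (preds > th).int()
        targs = targs.int()
        # Each sum has shape (n,) and broadcasts into the (1, n) accumulators.
        self.TP += (preds * targs).float().sum(dim=0)
        self.FP += (preds > targs).float().sum(dim=0)
        self.FN += (preds < targs).float().sum(dim=0)
        score = (2.0 * self.TP / (2.0 * self.TP + self.FP + self.FN + 1e-6)).mean()
        return score

If preds and targs live on the GPU, the accumulators would need to be created on the same device (or the per-batch sums moved to the CPU first). The 1st option works the other way round: convert at the top of __call__, e.g. preds = preds.cpu().numpy(), but then the rest of the method needs the NumPy spellings (.astype(...), sum(axis=0)) rather than the tensor ones.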