Project scenario:
Training a multi-head attention model
out = torch.einsum('b h d e, b h d n -> b h e n', context, q)
File "/root/anaconda3/lib/python3.7/site-packages/torch/functional.py", line 327, in einsum
return _VF.einsum(equation, operands) # type: ignore[attr-defined]
RuntimeError: einsum(): operands do not broadcast with remapped shapes [original->remapped]: [3, 4, 32, 32]->[3, 4, 32, 1, 32] [2, 4, 32, 65536]->[2, 4, 1, 65536, 32]
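The message above reports the two operand shapes as [3, 4, 32, 32] and [2, 4, 32, 65536]. In the equation 'b h d e, b h d n -> b h e n' both operands share the label b, yet the first has b = 3 and the second has b = 2, so the batch dimensions cannot be matched. A minimal sketch for locating this kind of conflict without running the model is shown below. The helper name `find_einsum_conflicts` is hypothetical (it is not part of PyTorch), and it assumes a size of 1 is broadcastable, as the wording of the error suggests.

```python
def find_einsum_conflicts(equation, shapes):
    """Return {label: sizes} for each index label bound to incompatible sizes.

    Hypothetical debugging helper, not a PyTorch API. It only handles
    plain single-letter labels (no '...' ellipsis).
    """
    # Keep only the input side of the equation and ignore spaces.
    inputs = equation.replace(" ", "").split("->")[0].split(",")
    if len(inputs) != len(shapes):
        raise ValueError("number of operands does not match the equation")

    seen = {}
    for labels, shape in zip(inputs, shapes):
        if len(labels) != len(shape):
            raise ValueError(f"'{labels}' does not match shape {tuple(shape)}")
        for label, size in zip(labels, shape):
            seen.setdefault(label, set()).add(size)

    # Assumption: size 1 broadcasts, so a label conflicts only when it is
    # bound to more than one distinct size other than 1.
    return {
        label: sorted(sizes)
        for label, sizes in seen.items()
        if len(sizes - {1}) > 1
    }


# Shapes copied from the RuntimeError message above.
conflicts = find_einsum_conflicts(
    "b h d e, b h d n -> b h e n",
    [(3, 4, 32, 32), (2, 4, 32, 65536)],
)
print(conflicts)  # {'b': [2, 3]}
```

Only b is flagged, so the mismatch is in the batch dimension of `context` versus `q`. One plausible cause, given the `sample(condition=data_c, batch_size=n)` call in the traceback, is that the conditioning batch and the requested `batch_size` differ; this is an assumption, since the log alone does not confirm it.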
Problem description
Full error:
Traceback (most recent call last):
File "multi_train.py", line 38, in <module>
trainer.train()
File "/root/sketchMultimodal/denoising-diffusion-pytorch-master/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py", line 661, in train
all_images_list = list(map(lambda n: self.ema_model.sample(condition=data_c, batch_size=n), batches))
File "/root/sketc