LV.1

zzlj

39 points, 1 like

3 posts, 7 replies, 0 favorites

Replies by this user

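Thread: Urgent: in an MLU270 environment (CNML: 7.10.2 0a592c0, CNRT: 4.10.1 a884a9a), is torch.nn.GRUCell supported? I keep getting errors saying that several operators are not supported.
Quoted reply #4 from 踏雪寻梅: "Dear developer, CNNL is mainly intended for the 3-series products; the 270 does not support it for now."
My reply: OK, thank you.
Likes: 0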

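Thread: Help! Is this an MLU270 over-temperature protection problem or a poor PCI connection?
My reply: Adding the relevant dmesg information:
Likes: 0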

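Thread: MLU270-F4 in a server: the cnmon command fails after the machine has been powered on continuously for a week.
Quoted reply #3 from sunxiaofeng: "I used the passively cooled card before, and sometimes it would simply die in the middle of a run. I suspect temperature plays a part."
My reply: The MLU270 is passively cooled, and recently it has been crashing frequently during use. After a crash the device also disappears from the lspci output. Is this caused by temperature or by something else? Every time, I have to reseat the card to get it working again.
Likes: 0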

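Thread: MLU270 driver problem
Quoted reply #1 from 止战之殇: "After reseating the card it runs for a while."
My reply: Hello, I have run into the same situation. How did you solve it?
Likes: 0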

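Thread: Urgent: in an MLU270 environment (CNML: 7.10.2 0a592c0, CNRT: 4.10.1 a884a9a), is torch.nn.GRUCell supported? I keep getting errors saying that several operators are not supported.
Quoted reply #2 from zzlj: "OK, understood, thanks."
My reply: Hello, sorry to trouble you again, but I have one more question: is it the case that the MLU270 does not support CNNL?
Likes: 0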

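Thread: Urgent: in an MLU270 environment (CNML: 7.10.2 0a592c0, CNRT: 4.10.1 a884a9a), is torch.nn.GRUCell supported? I keep getting errors saying that several operators are not supported.
Quoted reply #1 from 踏雪寻梅: "Dear developer, GRUCell is not supported at present, but the GRU operator is. If you need GRUCell, you can follow the Pytorch User Guide to add a custom operator implementation."
My reply: OK, understood, thanks.
Likes: 0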
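Since the quoted reply says the GRU operator is supported while GRUCell is not, one possible workaround is to run a single-layer nn.GRU over a length-1 sequence and copy the GRUCell weights into it. The sketch below uses only stock PyTorch on the CPU. Whether the resulting nn.GRU actually maps onto the MLU270 with this CNML/CNRT release is an assumption that has not been verified here, and the helper names gru_from_cell and gru_step are hypothetical.

```python
import torch
import torch.nn as nn


def gru_from_cell(cell: nn.GRUCell) -> nn.GRU:
    """Build a single-layer nn.GRU that reuses the weights of an nn.GRUCell."""
    gru = nn.GRU(cell.input_size, cell.hidden_size, num_layers=1, bias=cell.bias)
    with torch.no_grad():
        # Both modules store their gates in the same (reset, update, new) order,
        # so the parameters can be copied one to one.
        gru.weight_ih_l0.copy_(cell.weight_ih)
        gru.weight_hh_l0.copy_(cell.weight_hh)
        if cell.bias:
            gru.bias_ih_l0.copy_(cell.bias_ih)
            gru.bias_hh_l0.copy_(cell.bias_hh)
    return gru


def gru_step(gru: nn.GRU, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
    """One recurrent step: x is (batch, input), h is (batch, hidden)."""
    # nn.GRU expects (seq_len, batch, input) and (num_layers, batch, hidden).
    _, h_next = gru(x.unsqueeze(0), h.unsqueeze(0))
    return h_next.squeeze(0)


torch.manual_seed(0)
cell = nn.GRUCell(256, 256).eval()
gru = gru_from_cell(cell).eval()
x = torch.rand(1, 256)
h = torch.rand(1, 256)
with torch.no_grad():
    # Compare the cell output with the single-step GRU output on the CPU.
    same = torch.allclose(cell(x, h), gru_step(gru, x, h), atol=1e-5)
print(same)
```

In a model like the one discussed in this thread, the forward pass would call gru_step(self.rnn, y, x['h']) instead of self.rnn(y, x['h']). That substitution is again only a sketch and would still need to be checked against the MLU toolchain.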

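Thread: After quantization, running the model reports that it cannot run on the MLU, and it runs on the CPU instead.
Quoted reply #1 from 老黄牛: "The model probably contains operators that the MLU does not support, so online-mode inference automatically fell back to the CPU."
My reply: That fallback makes layer-by-layer inference, fused inference, and offline model generation all impossible. What should I do about it?
Likes: 0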
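To narrow down which layers might be triggering the CPU fallback described above, it can help to list the layer types a model contains and compare them against the operator list for the installed release. The sketch below uses only stock PyTorch. The function name leaf_module_types and the SUSPECT_TYPES set are hypothetical; the set holds only GRUCell because that is the one layer this user's threads report as unsupported. It inspects nn.Module layers only, so operators called functionally inside forward() are not covered.

```python
from collections import Counter

import torch.nn as nn


def leaf_module_types(model: nn.Module) -> Counter:
    """Count the class names of all leaf modules (modules without children)."""
    counts = Counter()
    for module in model.modules():
        if len(list(module.children())) == 0:
            counts[type(module).__name__] += 1
    return counts


# Hypothetical set of layer types to flag; fill it in from the operator
# list that ships with your toolchain release.
SUSPECT_TYPES = {"GRUCell"}

# Small stand-in model that is only inspected, never executed.
model = nn.Sequential(nn.Linear(82, 256), nn.ReLU(), nn.GRUCell(256, 256))
counts = leaf_module_types(model)
flagged = sorted(name for name in counts if name in SUSPECT_TYPES)
print(dict(counts))
print(flagged)
```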

Thread: Operator incompatibility causes RuntimeError: outputs_[i]->uses().empty() INTERNAL ASSERT FAILED at /pytorch/torch/csrc/jit/ir.cpp:1027, please report a bug to PyTorch. (eraseOutput
My reply: My code:

    from functools import partial
    from types import SimpleNamespace as SN

    import torch
    import torch as th
    import torch.nn as nn
    import torch_mlu
    import torch_mlu.core.mlu_model as ct
    import torch_mlu.core.mlu_quantize as mlu_quantize
    import torchvision.models as models

    ct.set_core_number(1)
    ct.set_core_version("MLU270")
    torch.set_grad_enabled(False)


    class RnnAgent(nn.Module):
        def __init__(self, obs_shape, n_actions, args):
            super(RnnAgent, self).__init__()
            self._n_layers = args.n_layers
            self._hidden_size = args.hidden_size
            # Encoder: n_layers blocks of Linear + ReLU
            layers = [nn.Linear(obs_shape, self._hidden_size), nn.ReLU()]
            for _ in range(self._n_layers - 1):
                layers += [nn.Linear(self._hidden_size, self._hidden_size), nn.ReLU()]
            self.enc = nn.Sequential(*layers)
            self.rnn = nn.GRUCell(self._hidden_size, self._hidden_size)
            self.f_out = nn.Linear(self._hidden_size, n_actions)

        def init_hidden(self):
            return th.zeros(1, self._hidden_size)

        def forward(self, x):
            # x is a dict with the observation 'obs' and the hidden state 'h'
            y = self.enc(x['obs'])
            h = self.rnn(y, x['h'])
            y = self.f_out(h)
            return y, h


    if __name__ == '__main__':
        device = torch.device('cpu')
        # print(device)
        args = SN()
        args.n_layers = 2
        args.hidden_size = 256
        obs_shape = 82
        n_actions = 5
        policy_net = RnnAgent(obs_shape=obs_shape, n_actions=n_actions, args=args).to(device)
        target_net = RnnAgent(obs_shape=obs_shape, n_actions=n_actions, args=args).to(device)
        policy_net.eval()
        # print(torch.__version__)
        input_o = torch.rand((1, 82), dtype=torch.float)
        input_h = torch.rand((1, 256), dtype=torch.float)
        policy_net.load_state_dict(torch.load('./cp_epoch50.pth', map_location='cpu'), False)

        # CPU quantization inference
        net_quantization = mlu_quantize.quantize_dynamic_mlu(policy_net, dtype='int8', gen_quant=True)
        x = {'obs': input_o, 'h': input_h}
        output = net_quantization(x)
        x, h = output
        # print(type(x), type(h))
        # print(x)
        torch.save(net_quantization.state_dict(), 'policyNet_cp_quantization.pth')

        # Reload the quantized weights and move the model and inputs to the MLU
        print(ct.mlu_device())
        net_quantization = mlu_quantize.quantize_dynamic_mlu(policy_net)
        net_quantization.load_state_dict(torch.load('policyNet_cp_quantization.pth'), False)
        net_quantization.to(ct.mlu_device())
        input_h_mlu = input_h.to(ct.mlu_device())
        input_o_mlu = input_o.to(ct.mlu_device())
        x_mlu = {'obs': input_o_mlu, 'h': input_h_mlu}

        # MLU layer-by-layer inference
        output = net_quantization(x_mlu)
        x, h = output
        print(type(x), type(h))
        print(x.cpu())

        # Fused inference and offline model generation
        ct.save_as_cambricon("policyNet_offline")
        traced_model = torch.jit.trace(net_quantization, x_mlu, check_trace=False)
        x, h = traced_model(x_mlu)
        print("-------------_+++++++++++++++++++++++++-----------------------------------++++++++++++++++++++++++++++++++++++++++++++++++++=--------------------------", x.cpu())

Likes: 1