PPYOLOE training issues
07/19 22:35:09 - mmengine - INFO - Epoch(train) [4][ 60/580] base_lr: 2.5000e-04 lr: 1.5509e-04 eta: 4 days, 0:55:13 time: 6.2245 data_time: 0.4387 memory: 1211 loss: 4.0579 loss_cls: 0.9532 loss_state: 1.2636 loss_bbox: 0.3440 loss_dfl: 1.4971
07/19 22:36:52 - mmengine - INFO - Epoch(train) [4][ 70/580] base_lr: 2.5000e-04 lr: 1.5595e-04 eta: 4 days, 1:04:14 time: 7.6080 data_time: 0.4719 memory: 5738 loss: 4.0051 loss_cls: 0.9280 loss_state: 1.2617 loss_bbox: 0.3368 loss_dfl: 1.4786
07/19 22:37:58 - mmengine - INFO - Epoch(train) [4][ 80/580] base_lr: 2.5000e-04 lr: 1.5681e-04 eta: 4 days, 0:58:00 time: 8.1634 data_time: 0.0477 memory: 1211 loss: 4.0580 loss_cls: 0.9506 loss_state: 1.2706 loss_bbox: 0.3783 loss_dfl: 1.4584
07/19 22:39:41 - mmengine - INFO - Epoch(train) [4][ 90/580] base_lr: 2.5000e-04 lr: 1.5767e-04 eta: 4 days, 1:06:24 time: 9.7714 data_time: 0.0519 memory: 1211 loss: 4.0740 loss_cls: 0.9612 loss_state: 1.2761 loss_bbox: 0.3790 loss_dfl: 1.4578
07/19 22:41:32 - mmengine - INFO - Epoch(train) [4][100/580] base_lr: 2.5000e-04 lr: 1.5853e-04 eta: 4 days, 1:18:20 time: 10.1447 data_time: 0.0393 memory: 1299 loss: 4.0076 loss_cls: 0.9446 loss_state: 1.2327 loss_bbox: 0.3803 loss_dfl: 1.4501
07/19 22:43:03 - mmengine - INFO - Epoch(train) [4][110/580] base_lr: 2.5000e-04 lr: 1.5940e-04 eta: 4 days, 1:21:53 time: 9.4774 data_time: 0.0465 memory: 1299 loss: 4.0742 loss_cls: 0.9750 loss_state: 1.2301 loss_bbox: 0.4177 loss_dfl: 1.4514
07/19 22:43:44 - mmengine - INFO - Epoch(train) [4][120/580] base_lr: 2.5000e-04 lr: 1.6026e-04 eta: 4 days, 1:05:55 time: 8.2485 data_time: 0.0239 memory: 1123 loss: 4.2037 loss_cls: 1.0055 loss_state: 1.2765 loss_bbox: 0.4376 loss_dfl: 1.4840
07/19 22:44:31 - mmengine - INFO - Epoch(train) [4][130/580] base_lr: 2.5000e-04 lr: 1.6112e-04 eta: 4 days, 0:52:08 time: 7.8639 data_time: 0.0401 memory: 1299 loss: 4.1204 loss_cls: 0.9918 loss_state: 1.2751 loss_bbox: 0.4120 loss_dfl: 1.4415
07/19 22:45:00 - mmengine - INFO - Epoch(train) [4][140/580] base_lr: 2.5000e-04 lr: 1.6198e-04 eta: 4 days, 0:31:01 time: 6.3823 data_time: 0.0468 memory: 1299 loss: 4.1752 loss_cls: 1.0122 loss_state: 1.2948 loss_bbox: 0.4150 loss_dfl: 1.4531
07/19 22:45:53 - mmengine - INFO - Epoch(train) [4][150/580] base_lr: 2.5000e-04 lr: 1.6284e-04 eta: 4 days, 0:20:00 time: 5.2226 data_time: 0.0467 memory: 1211 loss: 4.1133 loss_cls: 0.9960 loss_state: 1.2773 loss_bbox: 0.3973 loss_dfl: 1.4427
07/19 22:46:23 - mmengine - INFO - Epoch(train) [4][160/580] base_lr: 2.5000e-04 lr: 1.6371e-04 eta: 3 days, 23:59:52 time: 4.0014 data_time: 0.0396 memory: 1299 loss: 4.1325 loss_cls: 0.9886 loss_state: 1.2972 loss_bbox: 0.3905 loss_dfl: 1.4561
07/19 22:47:54 - mmengine - INFO - Epoch(train) [4][170/580] base_lr: 2.5000e-04 lr: 1.6457e-04 eta: 4 days, 0:04:03 time: 4.9978 data_time: 0.0289 memory: 1123 loss: 4.0423 loss_cls: 0.9726 loss_state: 1.2667 loss_bbox: 0.3852 loss_dfl: 1.4178
07/19 22:49:26 - mmengine - INFO - Epoch(train) [4][180/580] base_lr: 2.5000e-04 lr: 1.6543e-04 eta: 4 days, 0:07:57 time: 5.8808 data_time: 0.0185 memory: 1299 loss: 4.1775 loss_cls: 1.0039 loss_state: 1.2959 loss_bbox: 0.3878 loss_dfl: 1.4899
07/19 22:49:50 - mmengine - INFO - Epoch(train) [4][190/580] base_lr: 2.5000e-04 lr: 1.6629e-04 eta: 3 days, 23:46:10 time: 5.8069 data_time: 0.0079 memory: 1211 loss: 4.1025 loss_cls: 0.9986 loss_state: 1.2759 loss_bbox: 0.3874 loss_dfl: 1.4406
07/19 22:51:30 - mmengine - INFO - Epoch(train) [4][200/580] base_lr: 2.5000e-04 lr: 1.6716e-04 eta: 3 days, 23:53:28 time: 6.7419 data_time: 0.0106 memory: 1299 loss: 4.0270 loss_cls: 0.9840 loss_state: 1.2463 loss_bbox: 0.3891 loss_dfl: 1.4076
07/19 23:02:37 - mmengine - INFO - Epoch(train) [4][210/580] base_lr: 2.5000e-04 lr: 1.6802e-04 eta: 4 days, 3:35:58 time: 19.4827 data_time: 0.0172 memory: 1299 loss: 3.9365 loss_cls: 0.9562 loss_state: 1.1972 loss_bbox: 0.3909 loss_dfl: 1.3922
07/19 23:02:56 - mmengine - INFO - Epoch(train) [4][220/580] base_lr: 2.5000e-04 lr: 1.6888e-04 eta: 4 days, 3:11:23 time: 18.0313 data_time: 0.0172 memory: 1123 loss: 3.9469 loss_cls: 0.9466 loss_state: 1.1939 loss_bbox: 0.3961 loss_dfl: 1.4104
07/19 23:04:27 - mmengine - INFO - Epoch(train) [4][230/580] base_lr: 2.5000e-04 lr: 1.6974e-04 eta: 4 days, 3:14:05 time: 18.0290 data_time: 0.0515 memory: 1809 loss: 3.8146 loss_cls: 0.8968 loss_state: 1.1688 loss_bbox: 0.3726 loss_dfl: 1.3764
07/19 23:06:09 - mmengine - INFO - Epoch(train) [4][240/580] base_lr: 2.5000e-04 lr: 1.7060e-04 eta: 4 days, 3:20:38 time: 19.5705 data_time: 0.0513 memory: 1299 loss: 3.8529 loss_cls: 0.8854 loss_state: 1.1818 loss_bbox: 0.3745 loss_dfl: 1.4112
07/19 23:07:30 - mmengine - INFO - Epoch(train) [4][250/580] base_lr: 2.5000e-04 lr: 1.7147e-04 eta: 4 days, 3:19:33 time: 19.1939 data_time: 0.0699 memory: 1123 loss: 3.9280 loss_cls: 0.9007 loss_state: 1.2024 loss_bbox: 0.3729 loss_dfl: 1.4521
07/19 23:09:59 - mmengine - INFO - Exp name: ppyoloe_plus_s_fast_8xb8-80e_coco_20240719_184020
07/19 23:09:59 - mmengine - INFO - Epoch(train) [4][260/580] base_lr: 2.5000e-04 lr: 1.7233e-04 eta: 4 days, 3:43:49 time: 8.8543 data_time: 0.0840 memory: 1299 loss: 4.0175 loss_cls: 0.9366 loss_state: 1.2415 loss_bbox: 0.3666 loss_dfl: 1.4728
07/19 23:10:29 - mmengine - INFO - Epoch(train) [4][270/580] base_lr: 2.5000e-04 lr: 1.7319e-04 eta: 4 days, 3:23:24 time: 9.0525 data_time: 0.0855 memory: 1125 loss: 3.9124 loss_cls: 0.9142 loss_state: 1.1960 loss_bbox: 0.3407 loss_dfl: 1.4615
07/19 23:11:15 - mmengine - INFO - Epoch(train) [4][280/580] base_lr: 2.5000e-04 lr: 1.7405e-04 eta: 4 days, 3:09:43 time: 8.1679 data_time: 0.0544 memory: 1299 loss: 3.9072 loss_cls: 0.9227 loss_state: 1.1882 loss_bbox: 0.3505 loss_dfl: 1.4458
07/19 23:12:09 - mmengine - INFO - Epoch(train) [4][290/580] base_lr: 2.5000e-04 lr: 1.7491e-04 eta: 4 days, 2:58:39 time: 7.2122 data_time: 0.0544 memory: 1139 loss: 3.8454 loss_cls: 0.9033 loss_state: 1.1660 loss_bbox: 0.3427 loss_dfl: 1.4334
07/19 23:14:07 - mmengine - INFO - Epoch(train) [4][300/580] base_lr: 2.5000e-04 lr: 1.7578e-04 eta: 4 days, 3:10:56 time: 7.9462 data_time: 0.0460 memory: 1299 loss: 3.7560 loss_cls: 0.8613 loss_state: 1.1528 loss_bbox: 0.3260 loss_dfl: 1.4159
07/19 23:16:20 - mmengine - INFO - Epoch(train) [4][310/580] base_lr: 2.5000e-04 lr: 1.7664e-04 eta: 4 days, 3:28:22 time: 7.6028 data_time: 0.0254 memory: 1211 loss: 3.7026 loss_cls: 0.8481 loss_state: 1.1537 loss_bbox: 0.3134 loss_dfl: 1.3874
07/19 23:16:56 - mmengine - INFO - Epoch(train) [4][320/580] base_lr: 2.5000e-04 lr: 1.7750e-04 eta: 4 days, 3:10:56 time: 7.7392 data_time: 0.0238 memory: 1299 loss: 3.7546 loss_cls: 0.8488 loss_state: 1.1982 loss_bbox: 0.3319 loss_dfl: 1.3758
07/19 23:18:29 - mmengine - INFO - Epoch(train) [4][330/580] base_lr: 2.5000e-04 lr: 1.7836e-04 eta: 4 days, 3:14:21 time: 8.6788 data_time: 0.0149 memory: 963 loss: 3.8670 loss_cls: 0.8823 loss_state: 1.2236 loss_bbox: 0.3470 loss_dfl: 1.4140
07/19 23:19:29 - mmengine - INFO - Epoch(train) [4][340/580] base_lr: 2.5000e-04 lr: 1.7922e-04 eta: 4 days, 3:05:36 time: 8.7994 data_time: 0.0149 memory: 1044 loss: 3.8786 loss_cls: 0.8796 loss_state: 1.2530 loss_bbox: 0.3386 loss_dfl: 1.4073
07/19 23:22:00 - mmengine - INFO - Epoch(train) [4][350/580] base_lr: 2.5000e-04 lr: 1.8009e-04 eta: 4 days, 3:29:13 time: 9.4643 data_time: 0.0053 memory: 1299 loss: 4.1052 loss_cls: 0.9432 loss_state: 1.3361 loss_bbox: 0.3719 loss_dfl: 1.4541
07/19 23:22:51 - mmengine - INFO - Epoch(train) [4][360/580] base_lr: 2.5000e-04 lr: 1.8095e-04 eta: 4 days, 3:17:08 time: 7.8188 data_time: 0.0102 memory: 963 loss: 4.0853 loss_cls: 0.9371 loss_state: 1.3109 loss_bbox: 0.3897 loss_dfl: 1.4476
07/19 23:25:08 - mmengine - INFO - Epoch(train) [4][370/580] base_lr: 2.5000e-04 lr: 1.8181e-04 eta: 4 days, 3:35:30 time: 9.8407 data_time: 0.0102 memory: 1299 loss: 4.1099 loss_cls: 0.9707 loss_state: 1.3064 loss_bbox: 0.3791 loss_dfl: 1.4537
07/19 23:26:01 - mmengine - INFO - Epoch(train) [4][380/580] base_lr: 2.5000e-04 lr: 1.8267e-04 eta: 4 days, 3:24:44 time: 9.0422 data_time: 0.0134 memory: 1419 loss: 3.9697 loss_cls: 0.9292 loss_state: 1.2594 loss_bbox: 0.3538 loss_dfl: 1.4273
[E ProcessGroupNCCL.cpp:587] [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(OpType=_ALLGATHER_BASE, Timeout(ms)=1800000) ran for 1807354 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:587] [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(OpType=_ALLGATHER_BASE, Timeout(ms)=1800000) ran for 1807440 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:587] [Rank 2] Watchdog caught collective operation timeout: WorkNCCL(OpType=_ALLGATHER_BASE, Timeout(ms)=1800000) ran for 1807458 milliseconds before timing out.
Traceback (most recent call last):
  File "./tools/train.py", line 126, in <module>
    main()
  File "./tools/train.py", line 122, in main
    runner.train()
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1746, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/model/wrappers/distributed.py", line 121, in train_step
    losses = self._run_forward(data, mode='loss')
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/model/wrappers/distributed.py", line 161, in _run_forward
    results = self(**data, mode=mode)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmdet/models/detectors/base.py", line 92, in forward
    return self.loss(inputs, data_samples)
  File "/home/lsw/hl/mi/mmdet/models/detectors/single_stage.py", line 77, in loss
    x = self.extract_feat(batch_inputs)
  File "/home/lsw/hl/mi/mmdet/models/detectors/single_stage.py", line 157, in extract_feat
    x = self.backbone(batch_inputs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmyolo/models/backbones/base_backbone.py", line 221, in forward
    x = layer(x)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmyolo/models/layers/yolo_bricks.py", line 1295, in forward
    y2 = self.blocks(self.conv2(x))
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmyolo/models/layers/yolo_bricks.py", line 1154, in forward
    y = self.conv2(y)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmyolo/models/layers/yolo_bricks.py", line 252, in forward
    self.rbr_dense(inputs) +
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmcv/cnn/bricks/conv_module.py", line 281, in forward
    x = self.norm(x)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 749, in forward
    return sync_batch_norm.apply(
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/_functions.py", line 42, in forward
    dist._all_gather_base(combined_flat, combined, process_group, async_op=False)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 2070, in _all_gather_base
    work = group._allgather_base(output_tensor, input_tensor)
RuntimeError: NCCL communicator was aborted on rank 0. Original reason for failure was: [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(OpType=_ALLGATHER_BASE, Timeout(ms)=1800000) ran for 1807354 milliseconds before timing out.
    work = group._allgather_base(output_tensor, input_tensor)
RuntimeError: NCCL communicator was aborted on rank 3. Original reason for failure was: [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(OpType=_ALLGATHER_BASE, Timeout(ms)=1800000) ran for 1807440 milliseconds before timing out.
Traceback (most recent call last):
  File "./tools/train.py", line 126, in <module>
    main()
  File "./tools/train.py", line 122, in main
    runner.train()
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1746, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/model/wrappers/distributed.py", line 121, in train_step
    losses = self._run_forward(data, mode='loss')
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/model/wrappers/distributed.py", line 161, in _run_forward
    results = self(**data, mode=mode)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmdet/models/detectors/base.py", line 92, in forward
    return self.loss(inputs, data_samples)
  File "/home/lsw/hl/mi/mmdet/models/detectors/single_stage.py", line 77, in loss
    x = self.extract_feat(batch_inputs)
  File "/home/lsw/hl/mi/mmdet/models/detectors/single_stage.py", line 157, in extract_feat
    x = self.backbone(batch_inputs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmyolo/models/backbones/base_backbone.py", line 221, in forward
    x = layer(x)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmyolo/models/layers/yolo_bricks.py", line 1295, in forward
    y2 = self.blocks(self.conv2(x))
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmyolo/models/layers/yolo_bricks.py", line 1154, in forward
    y = self.conv2(y)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/hl/mi/mmyolo/models/layers/yolo_bricks.py", line 252, in forward
    self.rbr_dense(inputs) +
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmcv/cnn/bricks/conv_module.py", line 281, in forward
    x = self.norm(x)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 749, in forward
    return sync_batch_norm.apply(
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/_functions.py", line 42, in forward
    dist._all_gather_base(combined_flat, combined, process_group, async_op=False)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 2070, in _all_gather_base
    work = group._allgather_base(output_tensor, input_tensor)
RuntimeError: NCCL communicator was aborted on rank 2. Original reason for failure was: [Rank 2] Watchdog caught collective operation timeout: WorkNCCL(OpType=_ALLGATHER_BASE, Timeout(ms)=1800000) ran for 1807458 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:341] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 13725 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 13724) of binary: /home/lsw/miniconda3/envs/mi/bin/python
Traceback (most recent call last):
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/distributed/run.py", line 710, in run
    elastic_launch(
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
./tools/train.py FAILED
------------------------------------------------------------
Failures:
[1]:
time : 2024-07-19_23:57:21
host : lswPlus
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 13726)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
time : 2024-07-19_23:57:21
host : lswPlus
rank : 3 (local_rank: 3)
exitcode : -6 (pid: 13727)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 13727
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2024-07-19_23:57:21
host : lswPlus
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 13724)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
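Both the DDP gradient sync and every `SyncBatchNorm` forward issue NCCL collectives, so when one rank stalls (note the per-iteration `time` spiking from roughly 6 s to 19 s in the log above), the remaining ranks sit in `_all_gather_base` until the watchdog's default 30-minute limit (the `Timeout(ms)=1800000` in the error) expires and the whole job is torn down. The workaround applied here raises that timeout when the process group is created. In an mmengine project the `init_process_group` call normally happens inside the framework's distributed setup (e.g. `mmengine.dist.init_dist`), so the snippet below is the original one-liner expanded into a minimal, self-contained sketch with the imports it assumes: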
import datetime

import torch.distributed as dist

# Allow NCCL collectives 90 minutes (5400 s) instead of the default 30
# minutes before the watchdog aborts the communicator.
dist.init_process_group(backend='nccl', init_method='env://',
                        timeout=datetime.timedelta(seconds=5400))
07/20 17:37:28 - mmengine - INFO - Epoch(train) [3][2460/4634] base_lr: 1.2500e-04 lr: 6.3266e-05 eta: 15:32:14 time: 0.1281 data_time: 0.0223 memory: 654 loss: 4.5739 loss_cls: 1.0988 loss_state: 1.4604 loss_bbox: 0.6126 loss_dfl: 1.4021
Traceback (most recent call last):
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 990, in _try_get_data
data = self._data_queue.get(timeout=timeout)
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/queue.py", line 175, in get
while not self._qsize():
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/queue.py", line 209, in _qsize
return len(self.queue)
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 8104) is killed by signal: Killed.

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/lsw/HL/othercode/mi/tools/train.py", line 125, in <module>
main()
File "/home/lsw/HL/othercode/mi/tools/train.py", line 121, in main
runner.train()
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1746, in train
model = self.train_loop.run() # type: ignore
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
self.run_epoch()
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 111, in run_epoch
for idx, data_batch in enumerate(self.dataloader):
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
idx, data = self._get_data()
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1142, in _get_data
success, data = self._try_get_data()
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1003, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 8104) exited unexpectedly
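`Killed` means the worker received SIGKILL, which in this situation almost always comes from the kernel's OOM killer reclaiming host RAM from a dataloader worker process; it is not a CUDA error. The usual mitigation is to lower dataloader memory pressure. A minimal sketch of the relevant knobs follows; the values mirror the config dump at the end of this post, except `pin_memory=False`, which is an extra assumption rather than the author's setting:

train_dataloader = dict(
    batch_size=1,            # already 1 in the dump below
    num_workers=1,           # fewer worker processes -> less host RAM
    persistent_workers=True,
    pin_memory=False)        # pinned staging buffers also consume host RAM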
07/20 21:31:35 - mmengine - INFO - Epoch(train) [1][ 260/4634] base_lr: 1.2500e-04 lr: 1.3973e-06 eta: 14:31:26 time: 0.1365 data_time: 0.0374 memory: 715 loss: 7.9384 loss_cls: 2.2571 loss_state: 3.1435 loss_bbox: 0.7062 loss_dfl: 1.8315
Traceback (most recent call last):
File "/home/lsw/HL/othercode/mi/tools/train.py", line 125, in <module>
main()
File "/home/lsw/HL/othercode/mi/tools/train.py", line 121, in main
runner.train()
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1746, in train
model = self.train_loop.run() # type: ignore
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
self.run_epoch()
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
self.run_iter(idx, data_batch)
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
outputs = self.runner.model.train_step(
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 113, in train_step
data = self.data_preprocessor(data, True)
File "/home/lsw/miniconda3/envs/mi/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/lsw/HL/othercode/mi/mmyolo/models/data_preprocessors/data_preprocessor.py", line 163, in forward
_input = _input[[2, 1, 0], ...]
RuntimeError: CUDA out of memory. Tried to allocate 5.77 GiB (GPU 0; 11.91 GiB total capacity; 5.89 GiB already allocated; 4.86 GiB free; 5.98 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
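The failing statement `_input = _input[[2, 1, 0], ...]` is the BGR-to-RGB swap in the data preprocessor; fancy indexing materializes a full extra copy of the input on the GPU, so it is a natural place for the allocation to fail. Note also that the message reports 4.86 GiB free yet cannot serve a 5.77 GiB request, which points at allocator fragmentation. Two hedged mitigations consistent with the error message and the config below (the narrowed size range is an assumption, not the author's fix):

# 1) The allocator hint from the error message itself, set before launch:
#      export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
# 2) A lower multi-scale ceiling so the largest randomly resized batch
#    shrinks; the other keys are unchanged from the dump below:
batch_augments = [
    dict(type='PPYOLOEBatchRandomResize',
         random_size_range=(320, 640),  # was (320, 800)
         interval=1,
         size_divisor=32,
         random_interp=True,
         keep_ratio=False),
]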
Config
_backend_args = None
_multiscale_resize_transforms = [
    dict(
        transforms=[
            dict(scale=(640, 640), type='YOLOv5KeepRatioResize'),
            dict(allow_scale_up=False, pad_val=dict(img=114),
                 scale=(640, 640), type='LetterResize'),
        ],
        type='Compose'),
    dict(
        transforms=[
            dict(scale=(320, 320), type='YOLOv5KeepRatioResize'),
            dict(allow_scale_up=False, pad_val=dict(img=114),
                 scale=(320, 320), type='LetterResize'),
        ],
        type='Compose'),
    dict(
        transforms=[
            dict(scale=(960, 960), type='YOLOv5KeepRatioResize'),
            dict(allow_scale_up=False, pad_val=dict(img=114),
                 scale=(960, 960), type='LetterResize'),
        ],
        type='Compose'),
]
backend_args = None
# base_lr = 0.001
base_lr = 0.000125
custom_hooks = [
    dict(ema_type='ExpMomentumEMA', momentum=0.0002, priority=49,
         strict_load=False, type='EMAHook', update_buffers=True),
]
data_root = 'data/coco/'
dataset_type = 'YOLOv5CocoDataset'
deepen_factor = 0.33
default_hooks = dict(
    checkpoint=dict(interval=5, max_keep_ckpts=16, save_best='auto',
                    type='CheckpointHook'),
    logger=dict(interval=10, type='LoggerHook'),
    param_scheduler=dict(
        min_lr_ratio=0.0, start_factor=0.0, total_epochs=96,
        type='PPYOLOEParamSchedulerHook', warmup_epochs=5,
        warmup_min_iter=1000),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    timer=dict(type='IterTimerHook'),
    visualization=dict(type='mmdet.DetVisualizationHook'))
default_scope = 'mmyolo'
env_cfg = dict(
    cudnn_benchmark=False,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
img_scale = (640, 640)
img_scales = [(640, 640), (320, 320), (960, 960)]
launcher = 'none'
load_from = '/home/lsw/HL/othercode/mi/checkpoints/ppyoloe_plus_s_obj365_pretrained-bcfe8478.pth'
log_level = 'INFO'
log_processor = dict(by_epoch=True, type='LogProcessor', window_size=50)
max_epochs = 80
metainfo = dict(
    classes=('humanchild', 'humanpregnant_woman', 'humanold',
             'humanhandicapped'),
    palette=[(20, 220, 60)])
model = dict(
    backbone=dict(
        act_cfg=dict(inplace=True, type='SiLU'),
        attention_cfg=dict(act_cfg=dict(type='HSigmoid'),
                           type='EffectiveSELayer'),
        block_cfg=dict(shortcut=True, type='PPYOLOEBasicBlock',
                       use_alpha=True),
        deepen_factor=0.33,
        norm_cfg=dict(eps=1e-05, momentum=0.1, type='BN'),
        type='PPYOLOECSPResNet',
        use_large_stem=True,
        widen_factor=0.5),
    bbox_head=dict(
        bbox_coder=dict(type='DistancePointBBoxCoder'),
        head_module=dict(
            act_cfg=dict(inplace=True, type='SiLU'),
            featmap_strides=[8, 16, 32],
            in_channels=[192, 384, 768],
            norm_cfg=dict(eps=1e-05, momentum=0.1, type='BN'),
            num_base_priors=1,
            num_classes=4,
            num_state=4,
            reg_max=16,
            type='PPYOLOEHeadModule',
            widen_factor=0.5),
        loss_bbox=dict(
            bbox_format='xyxy', iou_mode='giou', loss_weight=2.5,
            reduction='mean', return_iou=False, type='IoULoss'),
        loss_cls=dict(
            alpha=0.75, gamma=2.0, iou_weighted=True, loss_weight=1.0,
            reduction='sum', type='mmdet.VarifocalLoss', use_sigmoid=True),
        loss_dfl=dict(loss_weight=0.125, reduction='mean',
                      type='mmdet.DistributionFocalLoss'),
        loss_state=dict(
            alpha=0.75, gamma=2.0, iou_weighted=True, loss_weight=1.0,
            reduction='sum', type='mmdet.VarifocalLoss', use_sigmoid=True),
        prior_generator=dict(offset=0.5, strides=[8, 16, 32],
                             type='mmdet.MlvlPointGenerator'),
        type='PPYOLOEHead'),
    data_preprocessor=dict(
        batch_augments=[
            dict(interval=1, keep_ratio=False, random_interp=True,
                 random_size_range=(320, 800), size_divisor=32,
                 type='PPYOLOEBatchRandomResize'),
        ],
        bgr_to_rgb=True,
        mean=[0.0, 0.0, 0.0],
        pad_size_divisor=32,
        std=[255.0, 255.0, 255.0],
        type='PPYOLOEDetDataPreprocessor'),
    neck=dict(
        act_cfg=dict(inplace=True, type='SiLU'),
        block_cfg=dict(shortcut=False, type='PPYOLOEBasicBlock',
                       use_alpha=False),
        deepen_factor=0.33,
        drop_block_cfg=None,
        in_channels=[256, 512, 1024],
        norm_cfg=dict(eps=1e-05, momentum=0.1, type='BN'),
        num_blocks_per_layer=3,
        num_csplayer=1,
        out_channels=[192, 384, 768],
        type='PPYOLOECSPPAFPN',
        use_spp=True,
        widen_factor=0.5),
    test_cfg=dict(
        max_per_img=300, multi_label=True,
        nms=dict(iou_threshold=0.7, type='nms'),
        nms_pre=1000, score_thr=0.01),
    train_cfg=dict(
        assigner=dict(
            alpha=1, beta=6, eps=1e-09, num_classes=4, topk=13,
            type='mi_BatchTaskAlignedAssigner'),
        initial_assigner=dict(
            iou_calculator=dict(type='mmdet.BboxOverlaps2D'),
            num_classes=4, topk=9, type='BatchATSSAssigner'),
        initial_epoch=30),
    type='YOLODetector')
num_classes = 4
num_state = 4
optim_wrapper = dict(
    optimizer=dict(lr=0.000125, momentum=0.9, nesterov=False, type='SGD',
                   weight_decay=0.0005),
    paramwise_cfg=dict(norm_decay_mult=0.0),
    type='OptimWrapper')
param_scheduler = None
persistent_workers = True
resume = False
save_epoch_intervals = 80
strides = [8, 16, 32]
test_cfg = dict(type='TestLoop')
test_dataloader = dict(
    batch_size=1,
    dataset=dict(
        ann_file='annotations/instances_val2017.json',
        data_prefix=dict(img='images/val2017/'),
        data_root='data/coco/',
        filter_cfg=dict(filter_empty_gt=True, min_size=0),
        metainfo=dict(
            classes=('humanchild', 'humanpregnant_woman', 'humanold',
                     'humanhandicapped'),
            palette=[(20, 220, 60)]),
        pipeline=[
            dict(backend_args=None, type='LoadImageFromFile'),
            dict(height=640, interpolation='bicubic', keep_ratio=False,
                 type='mmdet.FixShapeResize', width=640),
            dict(_scope_='mmdet', type='LoadAnnotations', with_bbox=True),
            dict(meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                            'scale_factor'),
                 type='mmdet.PackDetInputs'),
        ],
        test_mode=True,
        type='YOLOv5CocoDataset'),
    drop_last=False,
    num_workers=1,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(
    ann_file='data/coco/annotations/instances_val2017.json',
    metric='bbox',
    proposal_nums=(100, 1, 10),
    type='mmdet.CocoMetric')
test_pipeline = [
    dict(backend_args=None, type='LoadImageFromFile'),
    dict(height=640, interpolation='bicubic', keep_ratio=False,
         type='mmdet.FixShapeResize', width=640),
    dict(_scope_='mmdet', type='LoadAnnotations', with_bbox=True),
    dict(meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                    'scale_factor'),
         type='mmdet.PackDetInputs'),
]
train_batch_size_per_gpu = 1
train_cfg = dict(max_epochs=80, type='EpochBasedTrainLoop', val_interval=80)
train_dataloader = dict(
    batch_size=1,
    collate_fn=dict(type='yolov5_collate', use_ms_training=True),
    dataset=dict(
        ann_file='annotations/instances_train2017.json',
        data_prefix=dict(img='images/train2017/'),
        data_root='data/coco/',
        filter_cfg=dict(filter_empty_gt=True, min_size=0),
        metainfo=dict(
            classes=('humanchild', 'humanpregnant_woman', 'humanold',
                     'humanhandicapped'),
            palette=[(20, 220, 60)]),
        pipeline=[
            dict(backend_args=None, type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='PPYOLOERandomDistort'),
            dict(mean=(103.53, 116.28, 123.675), type='mmdet.Expand'),
            dict(type='PPYOLOERandomCrop'),
            dict(prob=0.5, type='mmdet.RandomFlip'),
            dict(meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                            'flip', 'flip_direction'),
                 type='mmdet.PackDetInputs'),
        ],
        type='YOLOv5CocoDataset'),
    num_workers=1,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=True, type='DefaultSampler'))
train_num_workers = 1
train_pipeline = [
    dict(backend_args=None, type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='PPYOLOERandomDistort'),
    dict(mean=(103.53, 116.28, 123.675), type='mmdet.Expand'),
    dict(type='PPYOLOERandomCrop'),
    dict(prob=0.5, type='mmdet.RandomFlip'),
    dict(meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                    'flip', 'flip_direction'),
         type='mmdet.PackDetInputs'),
]
tta_model = dict(
    tta_cfg=dict(max_per_img=300, nms=dict(iou_threshold=0.65, type='nms')),
    type='mmdet.DetTTAModel')
tta_pipeline = [
    dict(backend_args=None, type='LoadImageFromFile'),
    dict(
        transforms=[
            [
                dict(transforms=[
                    dict(scale=(640, 640), type='YOLOv5KeepRatioResize'),
                    dict(allow_scale_up=False, pad_val=dict(img=114),
                         scale=(640, 640), type='LetterResize'),
                ], type='Compose'),
                dict(transforms=[
                    dict(scale=(320, 320), type='YOLOv5KeepRatioResize'),
                    dict(allow_scale_up=False, pad_val=dict(img=114),
                         scale=(320, 320), type='LetterResize'),
                ], type='Compose'),
                dict(transforms=[
                    dict(scale=(960, 960), type='YOLOv5KeepRatioResize'),
                    dict(allow_scale_up=False, pad_val=dict(img=114),
                         scale=(960, 960), type='LetterResize'),
                ], type='Compose'),
            ],
            [
                dict(prob=1.0, type='mmdet.RandomFlip'),
                dict(prob=0.0, type='mmdet.RandomFlip'),
            ],
            [dict(type='mmdet.LoadAnnotations', with_bbox=True)],
            [dict(meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                             'scale_factor', 'pad_param', 'flip',
                             'flip_direction'),
                  type='mmdet.PackDetInputs')],
        ],
        type='TestTimeAug'),
]
val_batch_size_per_gpu = 1
val_cfg = dict(type='ValLoop')
val_dataloader = dict(
    batch_size=1,
    dataset=dict(
        ann_file='annotations/instances_val2017.json',
        data_prefix=dict(img='images/val2017/'),
        data_root='data/coco/',
        filter_cfg=dict(filter_empty_gt=True, min_size=0),
        metainfo=dict(
            classes=('humanchild', 'humanpregnant_woman', 'humanold',
                     'humanhandicapped'),
            palette=[(20, 220, 60)]),
        pipeline=[
            dict(backend_args=None, type='LoadImageFromFile'),
            dict(height=640, interpolation='bicubic', keep_ratio=False,
                 type='mmdet.FixShapeResize', width=640),
            dict(_scope_='mmdet', type='LoadAnnotations', with_bbox=True),
            dict(meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                            'scale_factor'),
                 type='mmdet.PackDetInputs'),
        ],
        test_mode=True,
        type='YOLOv5CocoDataset'),
    drop_last=False,
    num_workers=1,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(
    ann_file='data/coco/annotations/instances_val2017.json',
    metric='bbox',
    proposal_nums=(100, 1, 10),
    type='mmdet.CocoMetric')
val_num_workers = 1
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    name='visualizer',
    type='mmdet.DetLocalVisualizer',
    vis_backends=[dict(type='LocalVisBackend')])
widen_factor = 0.5
work_dir = 'HL'
Original article: https://blog.csdn.net/tingwods/article/details/140571659