A Closer Look at MMDetection

MMDetection

(object detection toolbox and benchmark)

  • MMDetection Paper : Here
  • Official code : Here

Let's look at MMDetection, an object detection toolbox, and at the benchmarks of the frameworks it supports.

Frameworks

Key points of the supported frameworks

Single stage

input -> feature extraction -> detection (localization, classification) -> output (multi-class classification, bounding box regression)

Localization and classification are solved at the same time.

| Name | Content | Year |
| --- | --- | --- |
| SSD | multi-scale feature maps | 2015 |
| RetinaNet | focal loss | 2017 |
| GHM | gradient harmonizing mechanism | 2019 |
| FCOS | fully convolutional, anchor-free | 2019 |
| FSAF | feature selective anchor-free module | 2019 |
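
Of the entries above, RetinaNet's focal loss is compact enough to illustrate in a few lines. The following is a minimal plain-Python sketch of binary focal loss for a single prediction, assuming the commonly used defaults alpha = 0.25 and gamma = 2. The function name is made up for illustration and is not MMDetection's API.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one predicted probability p in (0, 1) and label y in {0, 1}."""
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances positives against negatives.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.95, 1)  # confident and correct
hard = binary_focal_loss(0.10, 1)  # confident and wrong
print(easy < hard)
```

With gamma = 0 and alpha = 1 the expression reduces to ordinary cross-entropy for a positive example, which is a quick way to sanity-check the sketch.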

Two stage

input -> region proposal (localization) -> classification -> output (multi-class classification, bounding box regression)

Localization and classification are solved one after the other.

| Name | Content | Year |
| --- | --- | --- |
| Fast R-CNN | region proposals (RP), RoI pooling | 2015 |
| Faster R-CNN | Region Proposal Network (RPN), Fast R-CNN | 2015 |
| R-FCN | fully convolutional, Faster R-CNN | 2016 |
| Mask R-CNN | binary mask, RoI Align, Faster R-CNN | 2017 |
| Grid R-CNN | grid guided localization mechanism (improved bounding box regression), RPN | 2018 |
| Mask Scoring R-CNN | mask IoU prediction, Mask R-CNN | 2019 |
| Double-Head R-CNN | convolution head (localization) + fully connected head (classification) | 2019 |

Multi Stage

| Name | Content | Year |
| --- | --- | --- |
| Cascade R-CNN | multi-stage | 2017 |
| Hybrid Task Cascade | multi-stage, multi-branch, instance segmentation | 2019 |

General Modules and Methods

| Name | Content | Year |
| --- | --- | --- |
| Mixed Precision Training | half-precision floating point (FP16) | 2018 |
| Soft NMS | alternative to NMS | 2017 |
| OHEM | hard example sampling | 2016 |
| DCN | deformable convolution, deformable RoI pooling | 2017 |
| DCNv2 | modulated deformable operators | 2018 |
| ScratchDet | training from scratch, random initialization | 2018 |
| Train from Scratch | training from scratch | 2018 |
| M2Det | effective feature pyramids | 2018 |
| GCNet | global context block | 2019 |
| Generalized Attention | generalized attention formulation | 2019 |
| SyncBN, MegDet | synchronized batch normalization | 2017 |
| Group Normalization | group-wise normalization, an alternative to batch normalization | 2018 |
| Weight Standardization | micro-batch training | 2019 |
| HRNet | high-resolution representations, backbone | 2019 |
| Guided Anchoring | new anchoring scheme, sparse and arbitrary-shaped anchors | 2019 |
| Libra R-CNN | framework, balanced learning | 2019 |
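
As one concrete example from the table, Soft NMS lowers the scores of overlapping boxes instead of discarding them outright. The sketch below is a plain-Python version of the linear variant, assuming boxes given as (x1, y1, x2, y2) tuples. The function names and default thresholds are illustrative choices, not MMDetection's implementation.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, iou_thr=0.3, min_score=0.001):
    """Linear Soft NMS: decay the score of boxes that overlap a better box."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Pick the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            if overlap > iou_thr:
                score *= 1.0 - overlap  # decay instead of removing
            if score >= min_score:      # drop boxes whose score became negligible
                rescored.append((box, score))
        candidates = rescored
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
print(kept)
```

In this toy input the second box overlaps the first heavily, so it survives with a reduced score and ends up ranked below the non-overlapping third box.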

Architecture

Model Representations

  • Backbone: the feature extractor, e.g. a ResNet-50 without its fully connected layer.

  • Neck: refines or reconfigures the feature maps, e.g. FPN.

  • DenseHead: covers AnchorHead and AnchorFreeHead (RPNHead, RetinaHead, FCOSHead) and operates on dense locations of the feature maps.

  • RoIExtractor: extracts RoI-wise features using operators such as RoIPooling, e.g. SingleRoIExtractor.

  • RoIHead: classifies and regresses bounding boxes and predicts masks.
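
To make the order of these components concrete, here is a toy sketch of how a two-stage detector could chain them. Every class and function name below is hypothetical, and the dummy stages only record that they were called. Real MMDetection components are PyTorch modules built from config files.

```python
class TwoStageDetectorSketch:
    """Toy stand-in that only shows the order in which the components run."""

    def __init__(self, backbone, neck, dense_head, roi_extractor, roi_head):
        self.backbone = backbone
        self.neck = neck
        self.dense_head = dense_head
        self.roi_extractor = roi_extractor
        self.roi_head = roi_head

    def forward(self, image):
        feats = self.backbone(image)        # raw feature maps
        feats = self.neck(feats)            # refined feature maps (e.g. FPN)
        proposals = self.dense_head(feats)  # dense predictions, here proposals
        roi_feats = self.roi_extractor(feats, proposals)  # RoI-wise features
        return self.roi_head(roi_feats)     # classes, boxes, masks


def make_stage(name, trace):
    """Build a dummy component that records its name when called."""
    def stage(*inputs):
        trace.append(name)
        return name + "_out"
    return stage


trace = []
names = ["backbone", "neck", "dense_head", "roi_extractor", "roi_head"]
detector = TwoStageDetectorSketch(*(make_stage(n, trace) for n in names))
output = detector.forward("image")
print(trace)
```

A single-stage detector would stop after the dense head, since it predicts classes and boxes directly from the dense feature map locations.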

figure1

Training Pipeline

  • hooking: a command, method, technique, or action that intercepts or alters function calls, messages, events, and the like.

The training pipeline is built around a hooking mechanism.
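
The idea can be sketched with a tiny training loop that calls registered hooks at fixed points. This is a simplified illustration in plain Python. The class and method names are assumptions chosen for the example and do not reproduce the actual runner used by MMDetection.

```python
class Hook:
    """Base class: override only the points you want to intercept."""

    def before_run(self, runner): pass
    def before_epoch(self, runner): pass
    def after_iter(self, runner): pass
    def after_epoch(self, runner): pass
    def after_run(self, runner): pass


class Runner:
    """Minimal training loop that triggers hooks at each stage."""

    def __init__(self, num_epochs, iters_per_epoch):
        self.num_epochs = num_epochs
        self.iters_per_epoch = iters_per_epoch
        self.hooks = []
        self.epoch = 0
        self.iter = 0

    def register_hook(self, hook):
        self.hooks.append(hook)

    def call_hook(self, name):
        for hook in self.hooks:
            getattr(hook, name)(self)

    def run(self, train_step):
        self.call_hook("before_run")
        for epoch in range(self.num_epochs):
            self.epoch = epoch
            self.call_hook("before_epoch")
            for _ in range(self.iters_per_epoch):
                train_step(self)
                self.iter += 1
                self.call_hook("after_iter")
            self.call_hook("after_epoch")
        self.call_hook("after_run")


class LoggingHook(Hook):
    """Example hook that records progress without touching the loop itself."""

    def __init__(self):
        self.log = []

    def after_iter(self, runner):
        self.log.append(runner.iter)

    def after_epoch(self, runner):
        self.log.append("epoch %d done" % runner.epoch)


runner = Runner(num_epochs=2, iters_per_epoch=2)
logger = LoggingHook()
runner.register_hook(logger)
runner.run(train_step=lambda r: None)  # dummy step; a real one would update the model
print(logger.log)
```

Logging, checkpointing, or learning-rate scheduling can each be added as a separate hook this way, leaving the loop itself unchanged.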

figure2

Benchmark

Dataset

  • VOC
  • COCO

figure3

Appendix: Usage

Reference: GitHub

Setting up the environment

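# Create and activate a conda environment
conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab

# Install PyTorch and torchvision
conda install pytorch torchvision -c pytorch

# Clone the MMDetection repository
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection

# Install mmcv
pip install mmcv

# Install MMDetection in development mode
python setup.py develop
# or "pip install -v -e ."

# Link the dataset root into data/ ($COCO_ROOT must point to your COCO directory)
mkdir data
ln -s $COCO_ROOT data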

Preparing the datasets

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── cityscapes
│   │   ├── annotations
│   │   ├── train
│   │   ├── val
│   ├── VOCdevkit
│   │   ├── VOC2007
│   │   ├── VOC2012

Download the datasets and arrange them in the directory structure shown above.
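
A small script can confirm that the layout matches before training starts. The sketch below is a hypothetical helper, not part of MMDetection. It takes its directory names from the tree above and reports which of them are missing under a given data root.

```python
from pathlib import Path

# Directories taken from the tree above, relative to the data root.
EXPECTED_DIRS = [
    "coco/annotations",
    "coco/train2017",
    "coco/val2017",
    "coco/test2017",
    "cityscapes/annotations",
    "cityscapes/train",
    "cityscapes/val",
    "VOCdevkit/VOC2007",
    "VOCdevkit/VOC2012",
]


def missing_dataset_dirs(data_root):
    """Return the expected directories that do not exist under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Assumes the script is run from the mmdetection repository root.
    for d in missing_dataset_dirs("data"):
        print("missing:", d)
```

Datasets you do not plan to use can simply be removed from EXPECTED_DIRS.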

cd data/cityscapes/
mv train/*/* train/

Preparing the models

Download: https://github.com/open-mmlab/mmdetection/blob/master/docs/MODEL_ZOO.md

  • Running

dataset demo

python tools/test.py configs/faster_rcnn_r50_fpn_1x.py \
    checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth \
    --show

bbox, mask AP

python tools/test.py configs/mask_rcnn_r50_fpn_1x.py \
    checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth \
    --out results.pkl --eval bbox segm
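
The file written by --out can be inspected afterwards from Python. The sketch below assumes results.pkl is an ordinary pickle file with one entry per test image. The layout of each entry depends on the model and the MMDetection version, so the helper only looks at the top level. Both function names are made up for this example, and unpickling real results may require NumPy to be installed.

```python
import pickle


def load_results(path):
    """Load the object stored in a pickle file such as results.pkl."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(results):
    """Report the container type and the number of top-level entries."""
    return {"type": type(results).__name__, "num_entries": len(results)}


if __name__ == "__main__":
    # Assumes results.pkl exists in the current directory.
    print(summarize(load_results("results.pkl")))
```

Only unpickle files you produced yourself, since loading a pickle can execute arbitrary code.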

webcam demo

python demo/webcam_demo.py configs/faster_rcnn_r50_fpn_1x.py \
    checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth

MMDetection is a useful toolbox for trying out all kinds of experiments. It also provides high-level APIs, which makes it convenient to use.

This post is licensed under CC BY 4.0 by the author.