DETR Paper Review

Terminology

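non-maximum suppression (NMS)
A post-processing technique that, among the overlapping bounding boxes predicted for the same object, keeps the highest-scoring box and discards the near-duplicates.
IoU (Intersection over Union)
A metric for object detectors: the area of overlap between the ground-truth bounding box and the predicted bounding box, divided by the area of their union.
bipartite matching
A one-to-one assignment between two sets (in DETR, predicted boxes and ground-truth objects); the optimal matching is the one with the lowest total cost.
panoptic segmentation
semantic segmentation (per-pixel labeling) + instance segmentation (detecting each object and its segment). In other words, it also covers the distinction between stuff and things.
surrogate task
A well-understood substitute task used for analysis when the original task is a black box that cannot be examined directly.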
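Modern detectors
- RPN (Region Proposal Network)
  1. Prepare anchor boxes of several scales and aspect ratios in advance.
  2. Detect objects in a sliding-window fashion over the feature map.
  3. A single object produces many detections, so only the highest-scoring one is kept.
- FPN (Feature Pyramid Network)
  - When extracting feature maps, a top-down pathway with lateral (skip) connections combines high-resolution, low-level features with low-resolution, semantically strong features, which gives multi-scale information while keeping the extra computation small.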
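
The IoU and non-maximum suppression definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code of any particular detector: the corner format `(x1, y1, x2, y2)`, the function names `iou` and `nms`, and the default threshold of 0.5 are all assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed corner format


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Boxes 0 and 1 cover nearly the same region; box 2 is far away.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # indices of the boxes that survive
```

The threshold is exactly the kind of hand-tuned value that DETR tries to avoid by predicting a set of objects directly.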

Abstract

Removes components that otherwise have to be designed by hand, such as:
- non-maximum suppression
- anchor generation
The model is simple and does not require a specialized library.
It generalizes easily, so it can also produce panoptic segmentation.

Introduction

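Existing detectors make their predictions relative to proposals, anchors, or window centers.
Overlapping (near-duplicate) boxes are handled through the following:
- post-processing
- design of the anchor set
- anchor alignment (heuristics that assign target boxes to anchors)
These steps take considerable effort and resources, so the goal is to simplify the process → direct set prediction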

model overview

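Training is treated as a direct set prediction problem.

Encoder-decoder architecture of the transformer (benefiting from multi-head self-attention)
End-to-end training through a set loss function based on bipartite matching
- Predicts all objects at once
- Simplifies the pipeline by dropping hand-designed components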

features of DETR

Combines a bipartite matching loss between prediction/ground-truth pairs with a transformer that uses non-autoregressive parallel decoding.
Performs better on large objects than on small ones.

Related work

bipartite matching losses for set prediction
transformers and parallel decoding
object detection methods
set based loss
recurrent detectors

DETR Model

Needs
- A set prediction loss that assigns a unique matching between ground-truth boxes and predicted boxes
- An architecture that predicts a set of objects and models the relations between them

Object detection set prediction loss

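Returns a fixed-size set of N predictions in a single pass (N must be sufficiently large, i.e. larger than the typical number of objects in an image).
Builds a bipartite matching between ground truth and predictions, then optimizes the loss at the object level.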

pair-wise matching cost

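$\hat{\sigma} = \arg\min_{\sigma \in \mathfrak{S}_N} \sum_{i=1}^{N} L_{match}(y_i, \hat{y}_{\sigma(i)})$
$\hat{\sigma}$ is the optimal assignment: the permutation of the N predictions with the lowest total matching cost (the ground-truth set is padded with "no object" ∅ up to size N).
$L_{match}$ is the pair-wise matching cost between the ground truth $y_i$ and the prediction with index $\sigma(i)$, $\hat{y}_{\sigma(i)}$.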
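
To make the idea of an optimal assignment concrete, here is a small sketch that searches every permutation of a cost matrix and returns the cheapest one. It is a brute-force stand-in, not the Hungarian algorithm itself, and it is only practical for tiny N; the function name `optimal_assignment` and the cost values are made up for the example. In practice a solver such as `scipy.optimize.linear_sum_assignment` computes the same result efficiently.

```python
from itertools import permutations
from typing import Sequence, Tuple


def optimal_assignment(cost: Sequence[Sequence[float]]) -> Tuple[Tuple[int, ...], float]:
    """Return (sigma, total) where sigma[i] is the prediction matched to ground truth i.

    cost[i][j] is the matching cost between ground truth i and prediction j.
    Exhaustive search over all N! permutations, so only usable for tiny N.
    """
    n = len(cost)
    best_sigma: Tuple[int, ...] = ()
    best_total = float("inf")
    for sigma in permutations(range(n)):
        total = sum(cost[i][sigma[i]] for i in range(n))
        if total < best_total:
            best_sigma, best_total = sigma, total
    return best_sigma, best_total


# Hypothetical 3 x 3 cost matrix (rows: ground truth, columns: predictions).
cost = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.7],
    [0.6, 0.4, 0.3],
]
sigma, total = optimal_assignment(cost)
```

Each ground-truth entry gets exactly one prediction and each prediction is used once, which is what rules out duplicate detections of the same object.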

Match Loss

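Considers the class loss and the bounding box loss together.
Each ground truth is $y_i = (c_i, b_i)$ with class $c_i$ and box $b_i$. For the prediction with index $\sigma(i)$, the predicted probability of class $c_i$ is written $\hat{p}_{\sigma(i)}(c_i)$ and the predicted box $\hat{b}_{\sigma(i)}$.
$L_{match}(y_i, \hat{y}_{\sigma(i)}) = -\mathbb{1}_{\{c_i \neq \varnothing\}} \hat{p}_{\sigma(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}} L_{box}(b_i, \hat{b}_{\sigma(i)})$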
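
As a rough sketch of how the class term and the box term combine, the snippet below builds a cost matrix whose entry (i, j) is minus the probability that prediction j gives to the class of ground truth i, plus a box distance. Two simplifications are assumptions of this example: the box cost is a plain L1 distance (the generalized IoU term is left out), and only real objects get a row, since the cost against "no object" is constant. The names `match_cost_matrix` and `l1_distance` are hypothetical.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_cost_matrix(
    gt_classes: Sequence[int],
    gt_boxes: Sequence[Sequence[float]],
    pred_probs: Sequence[Sequence[float]],
    pred_boxes: Sequence[Sequence[float]],
) -> List[List[float]]:
    """cost[i][j] = -p_j(c_i) + box_cost(b_i, b_hat_j) for each real object i."""
    return [
        [-pred_probs[j][c] + l1_distance(b, pred_boxes[j]) for j in range(len(pred_boxes))]
        for c, b in zip(gt_classes, gt_boxes)
    ]


# Two objects, two predictions; boxes are (cx, cy, w, h) in normalized units.
gt_classes = [0, 1]
gt_boxes = [(0.5, 0.5, 0.2, 0.2), (0.3, 0.3, 0.1, 0.1)]
pred_probs = [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1]]  # last column: "no object"
pred_boxes = [(0.3, 0.3, 0.1, 0.1), (0.5, 0.5, 0.2, 0.2)]
cost = match_cost_matrix(gt_classes, gt_boxes, pred_probs, pred_boxes)
```

Here prediction 1 is the cheap match for object 0 and prediction 0 for object 1, because both the class probability and the box agree.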

hungarian loss

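$L_{Hungarian}(y, \hat{y}) = \sum_{i=1}^{N} \left[ -\log \hat{p}_{\hat{\sigma}(i)}(c_i) + \mathbb{1}_{\{c_i \neq \varnothing\}} L_{box}(b_i, \hat{b}_{\hat{\sigma}(i)}) \right]$
To reduce the class imbalance caused by the many "no object" (∅) targets, the log-probability term is down-weighted by a factor of 10 when $c_i = \varnothing$.
The matching cost between an object and ∅ does not depend on the prediction → that cost is a constant.
In the matching cost, the probability $\hat{p}_{\hat{\sigma}(i)}(c_i)$ is used instead of the log-probability, which makes the class prediction term commensurable with $L_{box}(\cdot, \cdot)$ and improved performance.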
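
Once an assignment is fixed, the loss is a sum over the matched pairs. The sketch below assumes the assignment `sigma` has already been computed, uses a plain L1 distance as a stand-in for the full box loss, and applies the factor-10 down-weighting to "no object" targets. The function names and the toy numbers are assumptions of this example, not the reference implementation.

```python
import math
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]
NO_OBJECT_WEIGHT = 0.1  # factor-10 down-weighting of the "no object" class term


def l1_box_loss(gt: Box, pred: Box) -> float:
    """Stand-in for L_box; the full loss also includes a generalized IoU term."""
    return sum(abs(g - p) for g, p in zip(gt, pred))


def hungarian_loss(
    gt_classes: Sequence[int],
    gt_boxes: Sequence[Optional[Box]],
    pred_probs: Sequence[Sequence[float]],
    pred_boxes: Sequence[Box],
    sigma: Sequence[int],
    no_object: int,
) -> float:
    """Sum the class and box terms over all N matched pairs (i, sigma[i])."""
    total = 0.0
    for i, j in enumerate(sigma):
        c = gt_classes[i]
        class_term = -math.log(pred_probs[j][c])
        if c == no_object:
            total += NO_OBJECT_WEIGHT * class_term  # no box term for "no object"
        else:
            total += class_term + l1_box_loss(gt_boxes[i], pred_boxes[j])
    return total


# Two slots (N = 2), classes {0, 1} plus "no object" at index 2.
gt_classes = [0, 2]
gt_boxes = [(0.5, 0.5, 0.2, 0.2), None]
pred_probs = [[0.1, 0.1, 0.8], [0.9, 0.05, 0.05]]
pred_boxes = [(0.1, 0.1, 0.1, 0.1), (0.5, 0.5, 0.2, 0.2)]
sigma = (1, 0)  # ground truth 0 -> prediction 1, "no object" -> prediction 0
loss = hungarian_loss(gt_classes, gt_boxes, pred_probs, pred_boxes, sigma, no_object=2)
```

Without the down-weighting, the many ∅ slots (N is much larger than the number of objects) would dominate the class term.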

bounding box loss

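Unlike modern detectors that predict boxes as offsets (Δ) from some initial guess, DETR predicts boxes directly → the relative scaling of the loss becomes an issue.
The L1 loss has a different scale for small and large boxes even when their relative errors are similar → use a linear combination of the L1 loss and the scale-invariant generalized IoU loss $L_{iou}(\cdot, \cdot)$:
$L_{box}(b_i, \hat{b}_{\sigma(i)}) = \lambda_{iou} L_{iou}(b_i, \hat{b}_{\sigma(i)}) + \lambda_{L1} \lVert b_i - \hat{b}_{\sigma(i)} \rVert_1$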
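
The scale issue is easy to see numerically. The sketch below computes generalized IoU and a combined box loss for corner-format boxes `(x1, y1, x2, y2)` with positive area. The corner format, the function names, and the default weights `lambda_iou=2.0` and `lambda_l1=5.0` are assumptions chosen for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed to have positive area


def giou(a: Box, b: Box) -> float:
    """Generalized IoU: IoU minus the share of the enclosing box not covered by the union."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Smallest axis-aligned box that encloses both a and b.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (enclose - union) / enclose


def l1(a: Box, b: Box) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def box_loss(gt: Box, pred: Box, lambda_iou: float = 2.0, lambda_l1: float = 5.0) -> float:
    """Linear combination of the GIoU loss (1 - GIoU) and the L1 loss."""
    return lambda_iou * (1.0 - giou(gt, pred)) + lambda_l1 * l1(gt, pred)


# Same 10% relative error on a small box and on a box ten times larger.
small_gt, small_pred = (0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.1, 1.1)
large_gt, large_pred = (0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 11.0, 11.0)
l1_small, l1_large = l1(small_gt, small_pred), l1(large_gt, large_pred)
giou_small, giou_large = giou(small_gt, small_pred), giou(large_gt, large_pred)
```

The L1 values differ by a factor of ten while the GIoU values are identical, which is why the two terms are combined.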

DETR architecture

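CNN backbone that extracts a compact feature representation
- Produces the activation map
encoder-decoder transformer
- encoder
  - A 1 x 1 convolution reduces the channel dimension of the activation map to a smaller dimension d.
  - The encoder expects a sequence as input, so the spatial dimensions are flattened into one dimension.
  - The transformer carries no position information, so fixed positional encodings are added.
- decoder
  - Transforms N embeddings of size d.
  - Unlike the original transformer, the N objects are decoded in parallel at each decoder layer.
  - The N input embeddings must differ to produce different results, so the inputs are learned positional encodings called object queries.
feed forward network (FFN) that returns the detection results
- Consists of an FFN with hidden dimension d and a linear projection layer
  - FFN : predicts the normalized center coordinates, height, and width of the box
  - linear layer : predicts the class label using softmax
Auxiliary decoding losses
- Prediction FFNs and a Hungarian loss are added after every decoder layer.
- All prediction FFNs share their parameters.
- A shared layer-norm is used for normalization.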
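
The data flow described above can be followed as a sequence of tensor shapes. The helper below only does the shape arithmetic and does not build a network. Its defaults (backbone stride 32, 2048 backbone channels, d = 256, N = 100 queries, 80 classes) are assumptions for illustration, and floor division ignores any padding a real backbone would apply. The second helper shows how a normalized `(cx, cy, w, h)` box from the FFN could be converted to pixel corners. Both function names are hypothetical.

```python
from typing import Dict, Tuple


def detr_shapes(
    h0: int,
    w0: int,
    stride: int = 32,
    backbone_channels: int = 2048,
    d: int = 256,
    num_queries: int = 100,
    num_classes: int = 80,
) -> Dict[str, Tuple[int, ...]]:
    """Shapes of the intermediate tensors for one h0 x w0 image."""
    h, w = h0 // stride, w0 // stride  # spatial size of the activation map
    return {
        "activation_map": (backbone_channels, h, w),  # CNN backbone output
        "after_1x1_conv": (d, h, w),                  # channels reduced to d
        "encoder_sequence": (h * w, d),               # spatial dims flattened
        "object_queries": (num_queries, d),           # learned decoder inputs
        "box_output": (num_queries, 4),               # (cx, cy, w, h) per query
        "class_output": (num_queries, num_classes + 1),  # +1 for "no object"
    }


def to_pixel_corners(
    box: Tuple[float, float, float, float], img_w: int, img_h: int
) -> Tuple[float, float, float, float]:
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )


shapes = detr_shapes(800, 1216)
corners = to_pixel_corners((0.5, 0.5, 0.2, 0.4), img_w=100, img_h=200)
```

The encoder sequence length grows with the image area, while the decoder always works on the same N queries regardless of image size.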
