WILDTRACK: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection

Dataset size: 57.9G


The challenging and realistic setup of the ‘WILDTRACK’ dataset brings multi-camera detection and tracking methods into the wild.

It meets the need of deep learning methods for a large-scale multi-camera dataset of walking pedestrians in which the cameras’ fields of view largely overlap. Acquired with current high-end hardware, it provides HD-resolution data. Furthermore, its high-precision joint calibration and synchronization should allow for the development of new algorithms that go beyond what is possible with currently available datasets.

The data acquisition took place in front of the main building of ETH Zurich, Switzerland, in good weather conditions. The sequences have a resolution of 1920×1080 pixels and were shot at 60 frames per second.

Description of available files

- Synchronized frames extracted at a frame rate of 10 fps and 1920×1080 resolution, post-processed to remove the distortion;
- Calibration files that use the pinhole camera model, compatible with the projection functions provided in the OpenCV library. Both the extrinsic and the intrinsic calibrations are available;
- The ground-truth annotations in JSON format (see the Annotations section below);
- For ease of use with methods that focus on classification, a file we refer to as the ‘positions’ file, also in JSON format (see the Positions file section below).

Please check this site for updates, which will extend the download list with:

- Full videos;
- Corresponding-point annotations, which may be used for camera calibration algorithms;
- A second part of this dataset which, although not annotated, can be used for unsupervised methods.

Positions file

The ‘positions file’ makes it possible to skip working with the calibration files and to focus, for instance, on classification, by exploiting the fact that the cameras are static. It records exactly where each of a given set of volumes of space projects to in every view. The height of each volume corresponds to the average height of a person.

We discretize the ground surface as a regular grid. The 3D space that a person occupies when standing at a particular position is modelled by a cylinder centred on the grid point. Each cylinder projects into each of the separate 2D views as a rectangle whose position in the view is given in pixel coordinates.
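The cylinder-to-rectangle projection described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes an undistorted pinhole camera, a world frame whose z axis points up from the ground plane, and hypothetical values for the cylinder radius, the cylinder height and the camera parameters. The rectangle is approximated by the bounding box of points sampled on the bottom and top rims of the cylinder.

```python
import math

NOT_VISIBLE = (-1, -1, -1, -1)


def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Pinhole projection of a world point: pixel = K (R X + t), no distortion."""
    cam = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if cam[2] <= 0:  # the point lies behind the camera
        return None
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)


def cylinder_to_rectangle(ground_xy, radius, height, camera,
                          image_size=(1920, 1080), samples=16):
    """Approximate bounding rectangle (xmin, ymin, xmax, ymax) of a projected cylinder."""
    pixels = []
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        x = ground_xy[0] + radius * math.cos(angle)
        y = ground_xy[1] + radius * math.sin(angle)
        for z in (0.0, height):  # bottom rim and top rim
            pixel = project_point((x, y, z), **camera)
            if pixel is None:
                return NOT_VISIBLE
            pixels.append(pixel)
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    xmin, ymin, xmax, ymax = min(xs), min(ys), max(xs), max(ys)
    width, height_px = image_size
    # A rectangle entirely outside the image is treated as not visible.
    if xmax < 0 or ymax < 0 or xmin >= width or ymin >= height_px:
        return NOT_VISIBLE
    return (xmin, ymin, xmax, ymax)


# Hypothetical camera: identity rotation, 5 units in front of the world origin.
camera = {
    "rotation": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    "translation": [0.0, 0.0, 5.0],
    "fx": 1000.0, "fy": 1000.0, "cx": 960.0, "cy": 540.0,
}
rect = cylinder_to_rectangle((0.0, 0.0), radius=0.5, height=2.0, camera=camera)
```

With the real calibration files, the rotation, translation and intrinsics would be read from the provided extrinsic and intrinsic calibrations instead of being hard-coded.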

Using a 480×1440 grid (691,200 positions in total) and the provided camera calibration files, we generated this file, which is available for download. Each position is assigned an ID using 0-based enumeration ([0, 691199]). The view indices in this file are also 0-based, i.e. they range from 0 to 6 inclusive. Positions that are not visible in a given view are assigned coordinates of -1.
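As a small worked example of the ID range above, the sketch below converts between a position ID and a grid cell. The row-major layout, with the 480-wide axis varying fastest, is an assumption made for illustration and is not stated on this page; the function names are hypothetical.

```python
GRID_WIDTH = 480    # assumed fast-varying axis
GRID_HEIGHT = 1440
NUM_POSITIONS = GRID_WIDTH * GRID_HEIGHT  # 691200 positions, IDs 0..691199


def position_id_to_cell(position_id):
    """Return (column, row) for a 0-based position ID, assuming row-major order."""
    if not 0 <= position_id < NUM_POSITIONS:
        raise ValueError(f"position ID out of range: {position_id}")
    return position_id % GRID_WIDTH, position_id // GRID_WIDTH


def cell_to_position_id(column, row):
    """Inverse of position_id_to_cell under the same assumed layout."""
    if not (0 <= column < GRID_WIDTH and 0 <= row < GRID_HEIGHT):
        raise ValueError(f"cell out of range: ({column}, {row})")
    return row * GRID_WIDTH + column
```

If the actual file uses a different ordering, only these two functions need to change.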

Annotations

Full ground-truth annotations are provided for 400 frames at a frame rate of 2 fps. On average, there are 20 persons in each frame. Thus, our dataset provides approximately 400 × 20 × 7 = 56,000 single-view bounding boxes. The number of annotations can be increased further by interpolation. These annotations were generated by workers hired on Amazon Mechanical Turk.
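One way to increase the number of annotations by interpolation is sketched below. It assumes plain linear interpolation of the box corners between two consecutive annotated frames of the same person in the same view; the page does not say which interpolation scheme the authors had in mind. Since frames are extracted at 10 fps and annotated at 2 fps, each annotated interval spans 5 frame steps and therefore has 4 unannotated frames in between.

```python
def interpolate_boxes(box_start, box_end, steps=5):
    """Linearly interpolate (xmin, ymin, xmax, ymax) boxes between two annotations.

    Returns the steps - 1 intermediate boxes, excluding both endpoints.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    boxes = []
    for k in range(1, steps):
        t = k / steps
        boxes.append(tuple(a + (b - a) * t for a, b in zip(box_start, box_end)))
    return boxes


# Two annotated boxes of the same person, half a second apart (hypothetical values).
intermediate = interpolate_boxes((0, 0, 10, 10), (10, 20, 20, 30))
```

Linear interpolation is only reasonable while the person is visible at both endpoints; a box with coordinates of -1 should not be interpolated.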

Note that the annotations roughly correspond to the coordinates in the positions file described above, and therefore include the ID of the position estimated to be occupied by each target. These position IDs are consistent with the provided positions file.
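To illustrate how such annotations might be consumed, the sketch below counts the single-view bounding boxes in one frame. The JSON layout and the field names (`personID`, `positionID`, `views`, `viewNum`, `xmin`, `ymin`, `xmax`, `ymax`) are assumptions made for this example and are not confirmed by this page, as is the convention that a person who is not visible in a view has box coordinates of -1.

```python
import json


def count_single_view_boxes(annotation_text):
    """Count the boxes over all persons and views in one frame's annotation."""
    persons = json.loads(annotation_text)
    count = 0
    for person in persons:
        for view in person["views"]:
            box = (view["xmin"], view["ymin"], view["xmax"], view["ymax"])
            if -1 not in box:  # assumed marker for "not visible in this view"
                count += 1
    return count


# Hypothetical annotation for a single frame with one person and two views.
sample = json.dumps([
    {
        "personID": 3,
        "positionID": 12345,
        "views": [
            {"viewNum": 0, "xmin": 100, "ymin": 200, "xmax": 160, "ymax": 380},
            {"viewNum": 1, "xmin": -1, "ymin": -1, "xmax": -1, "ymax": -1},
        ],
    }
])
visible_boxes = count_single_view_boxes(sample)
```

Summing this count over all annotated frames would give the dataset-wide number of single-view boxes that the estimate of roughly 56,000 refers to.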

Acknowledgment

This work was supported by the Swiss National Science Foundation, under the grant CRSII2-147693 “WILDTRACK”.

Publication

WILDTRACK: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection. T. Chavdarova; P. Baqué; A. Maksai; S. Bouquet; C. Jose et al. Computer Vision and Pattern Recognition, 2018. DOI: 10.1109/CVPR.2018.00528.

URL: https://www.epfl.ch/labs/cvlab/data/data-wildtrack/
License: No license specified, the work may be protected by copyright.

Bibtex:

@article{,
title= {The WILDTRACK Seven-Camera HD Dataset},
keywords= {},
author= {},
abstract= {The challenging and realistic setup of the WILDTRACK dataset brings multi-camera detection and tracking methods into the wild.},
terms= {},
license= {},
superseded= {},
url= {https://www.epfl.ch/labs/cvlab/data/data-wildtrack/}
}


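Data usage statement:

I. Data source and display:

1. This data comes from internet data collection or was supplied by a service provider; this platform provides users with display and browsing of the dataset.
2. This platform only displays basic information about datasets, including but not limited to file types such as images, text, video and audio.
3. Basic dataset information comes from the original data source or from the data provider. If the dataset description differs, the original data source or the provider's original address prevails.

II. Ownership:

1. Copyright in all datasets on this site belongs to the original data publisher or data provider.

III. Reposting:

1. If you need to repost data from this site, please retain the original data address and the relevant copyright notice.

IV. Infringement and handling:

1. If any data displayed on this site involves infringement, please contact the site promptly and we will arrange to take the data offline.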
