CrowdSplat: Exploring Gaussian Splatting For Crowd Rendering

Trinity College Dublin
* Equal contribution

Abstract

We present CrowdSplat, a novel approach that leverages 3D Gaussian Splatting for real-time, high-quality crowd rendering. Our method utilizes 3D Gaussian functions to represent animated human characters in diverse poses and outfits, which are extracted from monocular videos. We integrate Level of Detail (LoD) rendering to optimize computational efficiency and quality. The CrowdSplat framework consists of two stages: (1) avatar reconstruction and (2) crowd synthesis. The framework is also optimized for GPU memory usage to enhance scalability. Quantitative and qualitative evaluations show that CrowdSplat achieves good levels of rendering quality, memory efficiency, and computational performance. Through these experiments, we demonstrate that CrowdSplat is a viable solution for dynamic, realistic crowd simulation in real-time applications. Additionally, in our follow-up study, we investigate the perceived quality of 3D Gaussian Splatting for crowd rendering and look at the main factors that affect viewer perception.

Method Overview

Pipeline Overview

The first stage combines the estimated SMPL body poses and images with a UV positional map. This process fits the 3D Gaussian attributes for each sampled point on the SMPL mesh template, reconstructing a Gaussian avatar template. The second stage uses Linear Blend Skinning (LBS) to animate multiple crowd characters, with an LoD technique to reduce memory use and improve rendering speed. We reconstruct 14 avatar templates in the first stage and, in the second stage, duplicate them at random to form a crowd of 3,500 characters.
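The LBS step in the second stage can be illustrated with a minimal sketch. It is not the CrowdSplat implementation: the function names `blend_transforms` and `skin_point` are hypothetical, each joint transform is assumed to be a 4x4 homogeneous matrix, and only the Gaussian centre is moved (a full avatar would also need to rotate each Gaussian's covariance by the blended rotation).

```python
def blend_transforms(weights, transforms):
    """Weighted sum of 4x4 joint transforms for a single point.

    weights:    one skinning weight per joint (assumed to sum to 1)
    transforms: one 4x4 matrix (nested lists) per joint
    """
    return [
        [sum(w * T[r][c] for w, T in zip(weights, transforms)) for c in range(4)]
        for r in range(4)
    ]


def skin_point(point, weights, transforms):
    """Apply Linear Blend Skinning to one 3D point (e.g. a Gaussian centre)."""
    M = blend_transforms(weights, transforms)
    h = (point[0], point[1], point[2], 1.0)  # homogeneous coordinates
    return tuple(sum(M[r][c] * h[c] for c in range(4)) for r in range(3))


def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def translation(tx, ty, tz):
    T = identity()
    T[0][3], T[1][3], T[2][3] = tx, ty, tz
    return T


# Two hypothetical joints: one fixed, one translated 2 units along x.
joints = [identity(), translation(2.0, 0.0, 0.0)]
rest_point = (1.0, 1.0, 1.0)
posed = skin_point(rest_point, [0.5, 0.5], joints)
```

A point bound equally to both joints moves half of the second joint's translation, which is the blending behaviour LBS is named for. Because each character in the crowd only needs its own set of joint transforms, the same template points can be reused across many animated instances.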

Quantitative Results

Rendering Speed

Rendering speed (in FPS) for different numbers of Gaussians and characters.

GPU Memory Cost

GPU memory cost (in MiB) for different numbers of Gaussians and characters. † marks results obtained with the memory optimization implemented in the CUDA program.
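To give a feel for how memory scales with the number of Gaussians and characters, here is a back-of-envelope sketch. It assumes the attribute layout of the original 3D Gaussian Splatting formulation (position, rotation quaternion, scale, opacity, and degree-3 spherical harmonics, i.e. 59 32-bit floats per Gaussian) and that every character stores its own copy of its Gaussians. CrowdSplat's actual layout and its CUDA memory optimization may differ, so this is not a reproduction of the reported figures; `crowd_memory_mib` is a hypothetical helper.

```python
# Assumed per-Gaussian layout (standard 3DGS, not confirmed for CrowdSplat):
# 3 position + 4 rotation + 3 scale + 1 opacity + 48 SH colour coefficients.
FLOATS_PER_GAUSSIAN = 3 + 4 + 3 + 1 + 48


def crowd_memory_mib(num_characters, gaussians_per_character,
                     floats_per_gaussian=FLOATS_PER_GAUSSIAN,
                     bytes_per_float=4):
    """Estimate raw attribute storage in MiB, ignoring renderer buffers."""
    total_bytes = (num_characters * gaussians_per_character
                   * floats_per_gaussian * bytes_per_float)
    return total_bytes / (1024 ** 2)


# Example: compare two hypothetical LoD budgets for a 3,500-character crowd.
high = crowd_memory_mib(3500, 20000)
low = crowd_memory_mib(3500, 5000)
```

Under these assumptions the cost is linear in both the character count and the per-character Gaussian count, which is why reducing the Gaussian budget for distant characters lowers memory use directly.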

User Study

We conducted a two-alternative forced choice (2AFC) experiment to assess the perceived quality of 3D Gaussian avatars. Participants compared pairs of animated avatars and selected the one that appeared more detailed. We explored three key factors: motion, level of detail (LoD) based on the number of Gaussians, and avatar height in pixels (representing viewing distance). The findings help optimize LoD strategies for Gaussian-based crowd rendering, so that rendering stays efficient while visual fidelity remains high in real-time applications.
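As an illustration of how avatar height in pixels could drive an LoD choice, the following sketch projects a character's height with a pinhole camera model and picks a Gaussian budget from a threshold table. The thresholds, Gaussian counts, and the names `projected_height_px` and `select_lod` are hypothetical and are not values from the study.

```python
import math


def projected_height_px(avatar_height_m, distance_m,
                        vertical_fov_deg, image_height_px):
    """On-screen height of an upright avatar under a pinhole camera model."""
    focal_px = (image_height_px / 2) / math.tan(math.radians(vertical_fov_deg) / 2)
    return focal_px * avatar_height_m / distance_m


def select_lod(height_px, lod_table):
    """Return the Gaussian count of the first level whose threshold is met.

    lod_table: (min_height_px, num_gaussians) pairs sorted from the
               largest threshold to the smallest.
    """
    for min_height_px, num_gaussians in lod_table:
        if height_px >= min_height_px:
            return num_gaussians
    return lod_table[-1][1]  # fall back to the coarsest level


# Hypothetical thresholds, for illustration only.
LOD_TABLE = [(400, 20000), (100, 5000), (0, 1000)]

height = projected_height_px(avatar_height_m=2.0, distance_m=10.0,
                             vertical_fov_deg=90.0, image_height_px=1000)
budget = select_lod(height, LOD_TABLE)
```

In practice the thresholds would be set from perceptual data such as the 2AFC results, choosing for each pixel height the smallest Gaussian count that viewers do not reliably distinguish from a higher-detail version.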

BibTeX


  @inproceedings{sun2024crowdsplat,
    title     = {{CrowdSplat: Exploring Gaussian Splatting for Crowd Rendering}},
    author    = {Sun, Xiaohan and Xu, Yinghan and Dingliana, John and O'Sullivan, Carol},
    booktitle = {IET Conference Proceedings CP887},
    volume    = {2024},
    number    = {10},
    pages     = {311--314},
    year      = {2024},
    organization = {IET}
  }

  @article{sun2025evaluating,
    title     = {{Evaluating CrowdSplat: Perceived Level of Detail for Gaussian Crowds}},
    author    = {Sun, Xiaohan and Xu, Yinghan and Dingliana, John and O'Sullivan, Carol},
    journal   = {arXiv preprint arXiv:2501.17085},
    year      = {2025}
  }