GeoAware3D — Geometry-Aware 3D Semantic Features
Summary. We propose GeoAware3D, a zero-shot, class-agnostic method for decorating meshes and point clouds with geometry-aware semantic features. We modify the structure of DIFF3F to (i) render multi-view images, (ii) add texture via ControlNet-guided diffusion, (iii) fuse Stable Diffusion and DINO features, and (iv) unproject the per-pixel descriptors back to 3D, aggregating them with a k-NN mean to obtain vertex- or point-wise features. No training or extra data is required.
Highlights
- DIFF3F-derived, structurally modified pipeline (render → texture → fuse → unproject → aggregate).
- No training / no extra data; works on untextured shapes.
- Projective analysis: 3D → 2D renders → fused features → unprojection to 3D.
- Two correspondence modes: closest-vertex or direct point-to-point.
- Geometry-aware fusion improves robustness to pose and symmetries.
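Both correspondence modes reduce to nearest-neighbour matching under cosine similarity between two descriptor sets: fed vertex descriptors, the routine yields closest-vertex matches; fed dense point descriptors, direct point-to-point matches. A minimal NumPy sketch (the function name is ours, not from the paper):

```python
import numpy as np

def cosine_correspondences(feats_a, feats_b):
    """Match each descriptor in feats_a (n, d) to its nearest
    neighbour in feats_b (m, d) under cosine similarity."""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    sim = a @ b.T              # (n, m) pairwise cosine similarities
    return sim.argmax(axis=1)  # for each row of feats_a, best index into feats_b
```

The same routine serves both modes; only the granularity of the descriptors passed in changes.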
Results (SHREC’19, humans)
- Accuracy: 23.42% vs. DIFF3F 26.41% and SE-ORNet 21.41%.
- Runtime: ~1.02 min/mesh (DIFF3F ~4.42 min/mesh).
- Ablations:
  - 2D-only correspondence: 16.12%
  - Standard SD+DINO: 17.81%
  - Hyperfeatures: 18.54%
  - GeoAware3D: 23.35% (32 views) → 23.42% (64 views)
Method at a Glance
- Render k views (uniform azimuths, fixed elevation).
- Texture each view using ControlNet (prompted for high-definition photorealism).
- Fuse SD and DINO features (geometry-aware aggregation).
- Unproject per-pixel features to 3D using depth maps and the camera intrinsics K; build a point cloud of descriptors.
- Aggregate descriptors to vertices via k-NN mean; compute cosine-similarity correspondences (vertex- or point-level).
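The first step places the k rendering cameras at uniform azimuths around the shape at a fixed elevation. A minimal NumPy sketch of the camera placement (the radius and elevation defaults are illustrative assumptions, not values from the paper):

```python
import numpy as np

def camera_positions(k, radius=2.0, elevation_deg=20.0):
    """Place k cameras at uniformly spaced azimuths and a fixed
    elevation, all at distance `radius` from the origin."""
    az = np.linspace(0.0, 2 * np.pi, k, endpoint=False)  # uniform azimuths
    el = np.deg2rad(elevation_deg)
    x = radius * np.cos(el) * np.cos(az)
    y = radius * np.cos(el) * np.sin(az)
    z = radius * np.sin(el) * np.ones_like(az)
    return np.stack([x, y, z], axis=1)  # (k, 3) camera centres
```

Each camera would then be oriented to look at the origin, where the shape is centred.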
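The last two steps, unprojection and k-NN aggregation, can be sketched as follows. This is a NumPy illustration under standard pinhole-camera assumptions (function names and the brute-force neighbour search are ours; a real implementation would use a spatial index such as a k-d tree):

```python
import numpy as np

def unproject(depth, K, cam_to_world):
    """Back-project a depth map (h, w) to world-space points using
    pinhole intrinsics K (3, 3) and a 4x4 camera-to-world pose."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T         # camera-space ray directions
    pts_cam = rays * depth.reshape(-1, 1)   # scale rays by per-pixel depth
    pts_h = np.concatenate([pts_cam, np.ones((pts_cam.shape[0], 1))], axis=1)
    return (pts_h @ cam_to_world.T)[:, :3]  # (h*w, 3) world-space points

def knn_mean_features(vertices, points, feats, k=8):
    """Assign each vertex the mean feature of its k nearest
    unprojected descriptor points (brute-force for clarity)."""
    d2 = ((vertices[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :k]  # (n_vert, k) neighbour indices
    return feats[idx].mean(axis=1)       # (n_vert, d) aggregated features
```

In the pipeline, `points` would collect the unprojected pixels from all views and `feats` their fused SD+DINO descriptors, yielding one descriptor per vertex.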
