Pxform — Feedforward 3D Editing Learns from Semantic-Part Transformation

Abstract

3D editing is a fundamental capability for scalable 3D content creation. While image editing has rapidly evolved toward large-scale feedforward generative paradigms, 3D AI generation remains dominated by training-free editing pipelines. A central challenge of feedforward 3D editing lies in the lack of high-quality paired supervision. Editable 3D assets require simultaneous preservation of geometry, multi-view consistency, structural coherence, and localized edit controllability. Existing 3D editing datasets often rely on independently generated assets, image-mediated reconstruction or narrow edit taxonomies, leading to inaccurate localization, weak preservation, blurred edit boundaries, and limited semantic consistency. In this work, we introduce a new perspective: scalable feedforward 3D editing should be learned from semantic-part transformations. Based on this insight, we propose Pxform, a high-quality 3D editing dataset with over 100K consistent before/after editing pairs across seven edit types. Instead of treating objects as unstructured shapes, our pipeline grounds edits directly in semantic 3D parts. Built upon Pxform, we further propose PartFlow, a feedforward 3D editing network that injects source-aware latent control into pretrained 3D generative priors. PartFlow introduces mask-aware velocity preservation and render-space consistency supervision to jointly improve edit fidelity and source preservation, while requiring no 3D edit mask during inference. Extensive experiments demonstrate that high-quality semantic-part supervision substantially improves scalable 3D editing, enabling PartFlow to achieve state-of-the-art performance on both geometric and appearance editing benchmarks.

Results Comparison

Comparison with Prior Methods

Qualitative comparison against feedforward 3D editing baselines. Each row is one edit instruction: the leftmost panel is the Source 3D asset, followed by the 2D Condition reference, then each baseline’s result and our PartFlow output (highlighted).

Panels in each row, in page order: Source, Condition, PartFlow (Ours), Rodin Gen-2.5, 3DEditFormer, VoxHammer, Nano3D. The nine rows use the following edit instructions.

addition: “Add all the ceramic cups”
addition: “Add the blue circular component”
addition: “Add the donut box with the text ‘Donuts’”
deletion: “Remove the mortar bowl and keep only the pestle”
deletion: “Remove the two cabinet doors from the wardrobe facade”
modification: “Replace the balloon envelope with a geometric faceted polyhedron shape”
modification: “Replace the circular well base with a square stone foundation”
modification: “Replace the human head with a smooth spherical orb”
scale: “Shrink the Ferris wheel top component”

Dataset Examples

Editing examples from the Pxform Dataset, grouped by edit type. Each card shows the Before 3D asset (left), the After result (right), and in the middle the 2D edited reference image with the edit instruction below it. Both 3D meshes are decoded from SLAT through the same pipeline.

addition (4 examples)
“Add the face guard entirely”
“Add the cylindrical donut container”
“Add all metal bands and hoops to the barrel”
“Add the octagonal roof”

deletion (4 examples)
“Remove the top hat component”
“Remove the red and green component of the wooden structure”
“Remove the red polka dot mug and the blue polka dot mug”
“Remove the black and white walking cane”

modification (4 examples)
“Replace the helmet with a wide-brimmed hat”
“Replace the carriage body with a sleek aerodynamic capsule”
“Replace the classical upper torso with a futuristic armored chest plate”
“Transform the castle roof section into a thatched roof”

scale (4 examples)
“Shrink the whipped cream topping to a small dollop”
“Shrink the alien head to a smaller size”
“Shrink the white ceramic cup”
“Shrink the umbrella canopy”

color (4 examples)
“Repaint the balloon envelope in deep emerald green”
“Repaint the dress in deep crimson red”
“Repaint the landing pad matte charcoal black”
“Repaint the skirt in a deep emerald green shade”

material (4 examples)
“Change the base material to polished brass”
“Change the pedestal material to polished obsidian”
“Make the alien head out of frosted glass”
“Change the cello body to polished mahogany wood”

global (4 examples)
“Render the object as an ancient terracotta figurine”
“Transform the entire scene into a steampunk brass-and-gears aesthetic”
“Render the object in a stained-glass mosaic style”
“Transform the object into a Ming dynasty blue-and-white porcelain style”
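
For readers who want to work with pairs like the ones above, here is a minimal sketch of how one editing record might be represented. The class name `EditPair`, its field names, and the helper `group_by_type` are hypothetical and chosen for illustration only; the page does not specify the dataset's actual file layout or schema. Only the seven edit type names come from the examples shown here.

```python
from dataclasses import dataclass

# The seven edit types that appear in the dataset examples on this page.
EDIT_TYPES = ("addition", "deletion", "modification", "scale",
              "color", "material", "global")


@dataclass(frozen=True)
class EditPair:
    """One before/after editing pair (hypothetical record layout)."""
    before_asset: str      # path to the Before 3D asset (assumed field)
    after_asset: str       # path to the After 3D asset (assumed field)
    reference_image: str   # path to the 2D edited reference image (assumed field)
    instruction: str       # natural-language edit instruction
    edit_type: str         # must be one of EDIT_TYPES

    def __post_init__(self):
        # Reject records whose edit type is not one of the seven listed types.
        if self.edit_type not in EDIT_TYPES:
            raise ValueError(f"unknown edit type: {self.edit_type!r}")
        if not self.instruction.strip():
            raise ValueError("instruction must be non-empty")


def group_by_type(pairs):
    """Group pairs by edit type, keeping the order of EDIT_TYPES."""
    groups = {edit_type: [] for edit_type in EDIT_TYPES}
    for pair in pairs:
        groups[pair.edit_type].append(pair)
    return groups


if __name__ == "__main__":
    # File names below are placeholders, not real dataset paths.
    demo = [
        EditPair("roof_before.glb", "roof_after.glb", "roof_ref.png",
                 "Add the octagonal roof", "addition"),
        EditPair("cup_before.glb", "cup_after.glb", "cup_ref.png",
                 "Shrink the white ceramic cup", "scale"),
    ]
    for edit_type, items in group_by_type(demo).items():
        print(edit_type, len(items))
```

Validating the edit type at construction time keeps malformed records out of any later grouping or sampling step, which matters when batches are balanced across edit types.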

Method

Framework Overview

We learn scalable feedforward 3D editing from semantic-part transformations. Pxform grounds every edit in semantic 3D parts, yielding over 100K consistent before/after editing pairs across seven edit types. Built on this supervision, PartFlow injects source-aware latent control into pretrained 3D generative priors, and combines mask-aware velocity preservation with render-space consistency supervision to jointly improve edit fidelity and source preservation — while requiring no 3D edit mask at inference.
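
The overview names mask-aware velocity preservation but this page does not give its formula. As one plausible reading, the sketch below shows a flow-matching style velocity loss in which tokens outside the edit region receive a larger weight, so errors that would alter the preserved source content cost more. The function name `masked_velocity_loss`, the `preserve_weight` hyperparameter, the weighting scheme, and the rectified-flow target `x1 - x0` are all assumptions made for illustration, not PartFlow's actual objective. The mask is used only in this training-time loss, which is consistent with the statement that no 3D edit mask is required at inference.

```python
import numpy as np


def masked_velocity_loss(v_pred, v_target, edit_mask, preserve_weight=2.0):
    """Weighted velocity MSE that emphasizes tokens outside the edit region.

    v_pred, v_target: arrays of shape (num_tokens, channels).
    edit_mask: array of shape (num_tokens,), 1 inside the edit, 0 elsewhere.
    preserve_weight: hypothetical weight applied to preserved (mask == 0) tokens.
    """
    v_pred = np.asarray(v_pred, dtype=np.float64)
    v_target = np.asarray(v_target, dtype=np.float64)
    edit_mask = np.asarray(edit_mask, dtype=np.float64)

    # Mean squared velocity error per token, averaged over channels.
    per_token = np.mean((v_pred - v_target) ** 2, axis=-1)

    # Weight 1 inside the edit region, preserve_weight outside it.
    weights = edit_mask + preserve_weight * (1.0 - edit_mask)

    # Normalized weighted average over tokens.
    return float(np.sum(weights * per_token) / np.sum(weights))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.normal(size=(4, 3))      # noise sample (assumed rectified-flow setup)
    x1 = rng.normal(size=(4, 3))      # target edited latent (toy data)
    v_target = x1 - x0                # assumed straight-line velocity target
    v_pred = v_target + 0.1 * rng.normal(size=(4, 3))  # stand-in for a network output
    mask = np.array([1, 1, 0, 0])     # first two tokens belong to the edit
    print(masked_velocity_loss(v_pred, v_target, mask))
```

With `preserve_weight` set to 1.0 the function reduces to a plain per-token mean squared error, which makes it easy to ablate the weighting.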

Fig. 1. The Pxform dataset construction pipeline — semantic-part-grounded 3D editing pairs.

Fig. 2. Overview of the PartFlow feedforward 3D editing framework.
