March 02, 2026

ORGAN: Object-Centric Representation Learning using Cycle Consistent Generative Adversarial Networks

Authors

Joël Küchler, Ellen van Maren, Vaiva Vasiliauskaitė, Katarina Vulić, Reza Abbasi-Asl, Stephan J. Ihle

Abstract

Although data generation is often straightforward, extracting information from data is more difficult. Object-centric representation learning can extract information from images in an unsupervised manner. It does so by segmenting an image into its subcomponents: the objects. Each object is then represented in a low-dimensional latent space that can be used for downstream processing. Object-centric representation learning is dominated by autoencoder architectures (AEs). Here, we present ORGAN, a novel approach for object-centric representation learning, which is based on cycle-consistent Generative Adversarial Networks instead. We show that it performs similarly to other state-of-the-art approaches on synthetic datasets, while at the same time being the only approach tested here capable of handling more challenging real-world datasets with many objects and low visual contrast. Complementing these results, ORGAN creates expressive latent space representations that allow for object manipulation. Finally, we show that ORGAN scales well both with respect to the number of objects and the size of the images, giving it a unique edge over current state-of-the-art approaches.
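The abstract positions ORGAN on cycle-consistent GANs rather than autoencoders. The paper's own architecture is not detailed here, so as a minimal sketch, the generic cycle-consistency objective popularized by CycleGAN can be written as an L1 reconstruction penalty over both mapping directions; the toy mappings `G` and `F` below are hypothetical stand-ins for learned generators:

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """Generic L1 cycle-consistency loss:
    ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1 (mean over elements)."""
    forward = np.abs(F(G(x)) - x).mean()   # x -> G(x) -> F(G(x)) should recover x
    backward = np.abs(G(F(y)) - y).mean()  # y -> F(y) -> G(F(y)) should recover y
    return forward + backward

# Toy example: G doubles, F halves, so both cycles are exact and the loss is 0.
G = lambda v: 2.0 * v
F = lambda v: 0.5 * v
x = np.ones((4, 4))
y = np.full((4, 4), 3.0)
print(cycle_consistency_loss(x, y, G, F))  # → 0.0
```

In a CycleGAN-style setup this term is added to the adversarial losses so that the generators remain (approximately) inverses of each other; how ORGAN adapts this to per-object latent representations is described in the paper itself.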

Metadata

arXiv ID: 2603.02063
Provider: ARXIV
Primary Category: cs.CV
Published: 2026-03-02
Fetched: 2026-03-03 04:34

Code: https://github.com/Hullimulli/ORGAN