Method for obtaining rotation-invariant image representation by removing orientation features from autoencoder latent space

Bedratiuk, Anna

Files

CSIT-2025-N2+(19)+112-122.pdf (622.25 KB)

Date

2025

Authors

Bedratiuk, Anna

Publisher

Khmelnytskyi National University

Abstract

In many computer vision tasks, arbitrary object orientation complicates accurate object recognition. Rotation invariance is therefore critical for improving classification accuracy and reducing errors caused by varying object placement, particularly in real-world environments, where object orientation is rarely controlled. The goal of this study is to develop a method that separates rotational features from the semantic content of an object while preserving high classification accuracy after the orientation-related components are removed. This approach enables the construction of models that remain effective under a wide range of input perspectives, thus improving robustness in practical applications. The proposed method uses a convolutional variational autoencoder trained on a dataset of images rotated by various angles. Linear regression is then used to identify the latent components that correlate most strongly with the rotation parameter. These components are removed, and the remaining features are used for classification. In addition, images are reconstructed from the reduced latent vector to visually validate rotation invariance and to evaluate how well object shape is preserved. Experiments on a synthetically rotated, binarized digit dataset (modified MNIST) showed that removing rotation-sensitive components decreased classification accuracy by no more than 25–30% across latent space dimensions 3–10 (e.g., normalized accuracy dropped from 1.000 to 0.704 at d = 7). Reconstruction experiments showed that the semantic shape of digits was preserved while specific orientation information was suppressed. The scientific novelty of this work lies in a simple and reproducible method for removing orientation-related features from the latent space of an autoencoder without modifying the model architecture or introducing specialized regularizers.
The practical significance of the method lies in reducing the influence of arbitrary object orientation on recognition accuracy, thereby increasing the universality and reliability of vision systems in uncontrolled settings. The proposed approach may be useful for building classifiers that must handle images whose orientation varies or is unknown at the time of data collection.
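
The component-selection step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the latent vectors here are synthetic stand-ins for encoder outputs rather than codes from a trained variational autoencoder, the function names `rank_rotation_components` and `drop_components` are hypothetical, and ranking by the absolute standardized regression coefficient is only one plausible reading of "correlate most strongly". The number of removed components (two) and the angle range are likewise assumptions chosen for the illustration.

```python
import numpy as np


def rank_rotation_components(latents, angles):
    """Rank latent dimensions by how strongly they predict the rotation angle.

    Fits an ordinary least-squares regression of the standardized angle on
    the standardized latent components and orders the components by the
    absolute value of their coefficients (largest first).
    """
    z = (latents - latents.mean(axis=0)) / latents.std(axis=0)
    a = (angles - angles.mean()) / angles.std()
    coef, *_ = np.linalg.lstsq(z, a, rcond=None)
    return np.argsort(-np.abs(coef)), coef


def drop_components(latents, indices):
    """Return the latent matrix without the given columns, plus the kept indices."""
    keep = np.setdiff1d(np.arange(latents.shape[1]), indices)
    return latents[:, keep], keep


# Synthetic stand-in for encoder outputs (assumption: d = 7, 2000 samples).
rng = np.random.default_rng(0)
n, d = 2000, 7
angles = rng.uniform(-90.0, 90.0, size=n)   # rotation angle in degrees
labels = rng.integers(0, 10, size=n)        # digit class
latents = rng.normal(size=(n, d))
latents[:, 2] += 0.05 * angles              # planted orientation component
latents[:, 5] -= 0.03 * angles              # planted orientation component
latents[:, 0] += 0.5 * labels               # planted class-related component

order, coef = rank_rotation_components(latents, angles)
removed = np.sort(order[:2])                # drop the two strongest components
reduced, kept = drop_components(latents, removed)

# Residual correlation between the kept components and the angle.
residual_corr = np.array(
    [abs(np.corrcoef(reduced[:, j], angles)[0, 1]) for j in range(reduced.shape[1])]
)
print("removed components:", removed)
print("kept components:", kept)
print("max |corr| with angle after removal:", residual_corr.max())
```

In this toy setting the reduced matrix `reduced` would be passed to a classifier in place of the full latent vector. With a real encoder, orientation and class information are unlikely to separate this cleanly, which is consistent with the accuracy drop the abstract reports.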

Keywords

variational autoencoder, feature disentanglement, rotation invariance, semantic representation, convolutional architecture, image classification, algorithms, machine learning

Bibliographic description

Bedratiuk A. Method for obtaining rotation-invariant image representation by removing orientation features from autoencoder latent space / A. Bedratiuk // Computer Systems and Information Technologies. – 2025. – № 2. – P. 112-122.

URI

https://elar.khmnu.edu.ua/handle/123456789/19323

Collections

CSIT - 2025
