Method for obtaining rotation-invariant image representation by removing orientation features from autoencoder latent space
Date
2025
Publisher
Khmelnytskyi National University
Abstract
In many computer vision tasks, accurate object recognition is complicated by arbitrary object orientations. Ensuring rotation invariance is critical for improving classification accuracy and reducing errors related to the varying placement of objects.
This issue is particularly important in real-world environments, where object orientation is rarely controlled.
The goal of this study is to develop a method that separates rotational features from the semantic essence of an object while preserving high classification accuracy after the orientation-related components are removed. This approach enables the construction of models that remain effective under a wide range of input perspectives, thus improving robustness in practical applications.
The proposed method uses a convolutional variational autoencoder trained on a dataset of images rotated by various angles. Linear regression is then used to identify the latent components that correlate most strongly with the rotation parameter. These components are removed, and the remaining features are used for classification. Additionally, images are reconstructed from the reduced latent vector to visually validate rotation invariance and to evaluate how well object shape is preserved.
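The selection step described above can be sketched as follows. This is a minimal illustration rather than the author's implementation. It assumes that each latent component is scored by the R² of a univariate least-squares fit against the rotation angle, and that the k highest-scoring components are dropped. The helper names (`r_squared`, `rotation_sensitive_components`, `drop_components`), the 0–90 degree angle range, and the synthetic latent vectors standing in for encoder outputs are all hypothetical.

```python
import random


def r_squared(x, y):
    """R^2 of the least-squares line y ~ a*x + b (the squared Pearson correlation)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def rotation_sensitive_components(latents, angles, k):
    """Indices of the k latent components best explained linearly by the angle."""
    d = len(latents[0])
    scores = [r_squared(angles, [z[j] for z in latents]) for j in range(d)]
    ranked = sorted(range(d), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def drop_components(latents, drop):
    """Return latent vectors with the listed component indices removed."""
    drop = set(drop)
    return [[v for j, v in enumerate(z) if j not in drop] for z in latents]


# Synthetic stand-in for encoder outputs (assumption: d = 7, one angle-driven component).
rng = random.Random(0)
angles = [rng.uniform(0.0, 90.0) for _ in range(500)]
latents = []
for theta in angles:
    z = [rng.gauss(0.0, 1.0) for _ in range(7)]
    z[2] = 0.05 * theta + rng.gauss(0.0, 0.1)  # component 2 tracks the rotation angle
    latents.append(z)

drop = rotation_sensitive_components(latents, angles, k=1)
reduced = drop_components(latents, drop)  # features that would be passed to a classifier
```

In a real pipeline, `latents` would come from the trained encoder and `reduced` would feed the downstream classifier. A linear score is only one possible criterion, since rotation angle is a circular quantity.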
Experiments on a synthetically rotated, binarized digit dataset (modified MNIST) showed that removing rotation-sensitive components reduced classification accuracy by no more than 25–30% across latent space dimensions 3–10 (e.g., normalized accuracy dropped from 1.000 to 0.704 at d = 7). Reconstruction experiments showed that the semantic shape of the digits was preserved while specific orientation information was suppressed.
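As a quick check of the d = 7 example, the relative drop can be computed directly. The sketch assumes that "normalized accuracy" means accuracy divided by the accuracy obtained with the full latent vector; the abstract does not define the term. The function name `relative_drop` is illustrative.

```python
def relative_drop(acc_before, acc_after):
    """Fraction of the baseline accuracy lost after removing components."""
    return (acc_before - acc_after) / acc_before


drop_d7 = relative_drop(1.000, 0.704)
print(f"{drop_d7:.1%}")  # prints 29.6%
```

Under that assumption, the d = 7 case lies at the upper end of the reported 25–30% range.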
The scientific novelty of this work lies in introducing a simple and reproducible method for removing orientation-related features from the latent space of an autoencoder without modifying the model architecture or introducing specialized regularizers. The practical significance of the method is that it reduces the influence of arbitrary object orientation on recognition accuracy, thereby increasing the universality and reliability of vision systems in uncontrolled settings. The proposed approach may be useful for building classifiers that must handle images whose orientation varies or is unknown at data collection time.
Keywords
variational autoencoder, feature disentanglement, rotation invariance, semantic representation, convolutional architecture, image classification, algorithms, machine learning
Bibliographic description
Bedratiuk A. Method for obtaining rotation-invariant image representation by removing orientation features from autoencoder latent space / A. Bedratiuk // Computer Systems and Information Technologies. – 2025. – № 2. – P. 112-122.