1.
Byra M., Rachmadi M.♦, Skibbe H.♦, Few-shot medical image classification with simple shape and texture text descriptors using vision-language models,
BULLETIN OF THE POLISH ACADEMY OF SCIENCES: TECHNICAL SCIENCES, ISSN: 0239-7528, DOI: 10.24425/bpasts.2025.153838, Vol.73 (3), No.153838, pp.1-8, 2025

Abstract: Deep learning methods are gaining momentum in radiology. In this work, we investigate the usefulness of vision-language models (VLMs) and large language models for binary few-shot classification of medical images. We utilize the GPT-4 model to generate text descriptors that encapsulate the shape and texture characteristics of objects in medical images. Subsequently, these GPT-4 generated descriptors, alongside VLMs pre-trained on natural images, are employed to classify chest X-rays and breast ultrasound images. Our results indicate that few-shot classification of medical images using VLMs and GPT-4 generated descriptors is a viable approach. However, accurate classification requires the exclusion of certain descriptors from the calculation of the classification scores. Moreover, we assess the ability of VLMs to evaluate shape features in breast mass ultrasound images. This is performed by comparing VLM-based results generated for shape-related text descriptors with the actual values of the shape features calculated using segmentation masks. We further investigate the degree of variability among the sets of text descriptors produced by GPT-4. Our work provides several important insights about the application of VLMs for medical image analysis.

Keywords: medical image classification, vision-language models, large language models, few-shot learning

Affiliations:
- Byra M. - IPPT PAN
- Rachmadi M. - other affiliation
- Skibbe H. - other affiliation
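The classification scheme summarized above — scoring an image against per-class text descriptors with a VLM, and excluding unreliable descriptors from the score — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy vectors stand in for embeddings that a CLIP-style image/text encoder would produce, and the function name and the boolean `keep` mask are hypothetical.

```python
import numpy as np

def descriptor_scores(image_emb, class_descriptor_embs, keep=None):
    """Score per class: mean cosine similarity between the image embedding
    and that class's descriptor embeddings. `keep` is an optional boolean
    mask (hypothetical) that drops descriptors from the score, mirroring
    the descriptor-exclusion step described in the abstract."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = []
    for descs in class_descriptor_embs:
        sims = [cos(image_emb, d) for d in descs]
        if keep is not None:
            sims = [s for s, k in zip(sims, keep) if k]
        scores.append(sum(sims) / len(sims))
    return scores

# Toy example: in practice these embeddings come from a VLM's encoders.
image = np.array([1.0, 0.1, 0.0])
benign = [np.array([0.9, 0.2, 0.1]), np.array([1.0, 0.0, 0.2])]    # e.g. "oval shape", "smooth margin"
malignant = [np.array([0.0, 1.0, 0.3]), np.array([0.1, 0.8, 0.5])]  # e.g. "irregular shape", "spiculated margin"
scores = descriptor_scores(image, [benign, malignant])
prediction = int(np.argmax(scores))
```

In a real pipeline the few-shot aspect would enter through which descriptors are kept and how the per-descriptor similarities are calibrated on the few labeled examples; the mean-of-cosines score above is only the simplest variant.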
2.
Byra M., Poon C.♦, Rachmadi Muhammad F.♦, Schlachter M.♦, Skibbe H.♦, Exploring the performance of implicit neural representations for brain image registration,
Scientific Reports, ISSN: 2045-2322, DOI: 10.1038/s41598-023-44517-5, Vol.13, No.17334, pp.1-13, 2023

Abstract: Pairwise image registration is a necessary prerequisite for brain image comparison and data integration in neuroscience and radiology. In this work, we explore the efficacy of implicit neural representations (INRs) in improving the performance of brain image registration in magnetic resonance imaging. In this setting, INRs serve as a continuous and coordinate-based approximation of the deformation field obtained through a multi-layer perceptron. Previous research has demonstrated that sinusoidal representation networks (SIRENs) surpass ReLU models in performance. In this study, we first broaden the range of activation functions to further investigate the registration performance of implicit networks equipped with activation functions that exhibit diverse oscillatory properties. Specifically, in addition to the SIRENs and ReLU, we evaluate activation functions based on snake, sine+, chirp and Morlet wavelet functions. Second, we conduct experiments to relate the hyper-parameters of the models to registration performance. Third, we propose and assess various techniques, including cycle consistency loss, ensembles and cascades of implicit networks, as well as a combined image fusion and registration objective, to enhance the performance of implicit registration networks beyond the standard approach. The investigated implicit methods are compared to the VoxelMorph convolutional neural network and to the symmetric image normalization (SyN) registration algorithm from the Advanced Normalization Tools (ANTs). Our findings not only highlight the remarkable capabilities of implicit networks in addressing pairwise image registration challenges, but also showcase their potential as a powerful and versatile off-the-shelf tool in the fields of neuroscience and radiology.

Affiliations:
- Byra M. - IPPT PAN
- Poon C. - other affiliation
- Rachmadi Muhammad F. - other affiliation
- Schlachter M. - other affiliation
- Skibbe H. - other affiliation
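The core idea of the second entry — an INR that maps voxel coordinates to a displacement through a sinusoidal MLP — can be sketched in plain NumPy. The omega = 30 frequency and the uniform initialization bound follow the standard SIREN recipe; the layer sizes, class and function names, and toy data below are illustrative assumptions, not the paper's implementation, and a real registration network would be trained with an image-similarity loss.

```python
import numpy as np

class SirenLayer:
    """Hidden layer of a sinusoidal representation network (SIREN):
    y = sin(omega * (x W^T + b)), with the sqrt(6/in_dim)/omega uniform
    init bound from the standard SIREN recipe."""
    def __init__(self, in_dim, out_dim, omega=30.0, seed=0):
        rng = np.random.default_rng(seed)
        bound = np.sqrt(6.0 / in_dim) / omega
        self.W = rng.uniform(-bound, bound, size=(out_dim, in_dim))
        self.b = np.zeros(out_dim)
        self.omega = omega

    def __call__(self, x):
        return np.sin(self.omega * (x @ self.W.T + self.b))

def displacement(coords, hidden_layers, W_out, b_out):
    """Map normalized voxel coordinates (N, 3) to displacement vectors (N, 3).
    Registration then warps the moving image at coords + displacement."""
    h = coords
    for layer in hidden_layers:
        h = layer(h)
    return h @ W_out.T + b_out  # linear output head, no activation

# Toy network: 3 -> 32 -> 32 -> 3 displacement field.
layers = [SirenLayer(3, 32, seed=1), SirenLayer(32, 32, seed=2)]
rng = np.random.default_rng(3)
W_out = 0.01 * rng.standard_normal((3, 32))  # small init -> near-identity warp
b_out = np.zeros(3)
coords = rng.uniform(-1.0, 1.0, size=(5, 3))  # coordinates normalized to [-1, 1]
disp = displacement(coords, layers, W_out, b_out)
```

Swapping the `np.sin` activation for snake, chirp, or Morlet-wavelet variants, as the abstract describes, only changes the `__call__` of the hidden layer; the coordinate-in, displacement-out structure stays the same.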