Index
Improving AFFGANwriting by Exploring Deep Learning Models for Style Encoders and Image Generation of Sentence-Level Handwriting
Handwriting generation is a fundamental task in computer vision and natural language processing, with applications such as personalized content generation. The AFFGANwriting model presents a generative framework for synthesizing word-level handwritten images by fusing multi-style features in a GAN-based approach with a VGG-based style encoder. However, its scope is limited in two ways:
• It generates only individual word images
• It uses a fixed VGG backbone, which may not capture style semantics as effectively as more modern CNN and transformer backbones (e.g. EfficientNet, ResNet, DINO)
With increasing demand for personalized handwriting synthesis across longer text spans, there is a clear motivation to explore whether more advanced backbones can improve style feature extraction. In addition, the generative capacity needs to be extended from words to full sentences, ideally exposed through a user-friendly interactive system.
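As one illustration of the first direction, the following is a minimal sketch, assuming PyTorch and the timm library, of a style encoder with an interchangeable backbone; the backbone names, the 512-dimensional style code, and the projection layer are illustrative choices, not the AFFGANwriting configuration.

import torch
import torch.nn as nn
import timm  # assumption: timm supplies interchangeable pretrained backbones

class StyleEncoder(nn.Module):
    """Maps style reference images to a fixed-size style code for the generator."""
    def __init__(self, backbone_name="resnet50", style_dim=512):
        super().__init__()
        # num_classes=0 removes the classification head and returns pooled features.
        self.backbone = timm.create_model(backbone_name, pretrained=True, num_classes=0)
        self.project = nn.Linear(self.backbone.num_features, style_dim)

    def forward(self, style_images):            # style_images: (B, 3, H, W)
        feats = self.backbone(style_images)     # (B, num_features) pooled backbone features
        return self.project(feats)              # (B, style_dim) style code

# Usage: swap the backbone by name, e.g. "efficientnet_b0" or a DINO-pretrained ViT.
encoder = StyleEncoder("resnet50")
style_code = encoder(torch.randn(2, 3, 224, 224))

Because timm exposes CNN and ViT backbones behind the same interface, comparing EfficientNet, ResNet, and DINO-pretrained transformers reduces to changing the backbone_name string while the rest of the pipeline stays fixed.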
Research questions
• Can more recent CNN and transformer feature extractors (EfficientNet, ResNet, DINO) outperform VGG in capturing style-relevant features for handwriting generation?
• What are the architectural or training modifications required to extend AFFGANwriting from word-level to sentence-level image synthesis?
• How can the model be integrated into an intuitive web application that allows users to select a writing style and input arbitrary text for sentence-level generation?
Goal
To enhance AFFGANwriting’s quality and flexibility in handwriting image generation by:
• Upgrading the style encoder
• Enabling sentence-level synthesis
• Deploying the system as a web app for user interaction (a minimal interface sketch follows below)
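A minimal sketch of such a user-facing interface, assuming Gradio as the web framework; generate_sentence_image is a hypothetical placeholder for the trained style encoder and generator, and the writer names are illustrative.

import gradio as gr
from PIL import Image

def generate_sentence_image(style_name: str, text: str) -> Image.Image:
    # Placeholder: a real implementation would encode the selected writer's style,
    # synthesize each word with the generator, and stitch the word images into a sentence.
    return Image.new("RGB", (800, 120), "white")

demo = gr.Interface(
    fn=generate_sentence_image,
    inputs=[
        gr.Dropdown(choices=["writer_01", "writer_02", "writer_03"], label="Handwriting style"),
        gr.Textbox(label="Text to render"),
    ],
    outputs=gr.Image(label="Generated handwriting"),
    title="Sentence-level handwriting generation",
)

if __name__ == "__main__":
    demo.launch()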
Plug-and-Play Diffusion Models for Magnetic Resonance Imaging
Gridline Suppression in X-Ray Imaging Using Global Feature-Augmented U-Nets
Synthetic Non-Contrast CT Angiography Image Generation using Deep Learning Methods
Advanced Machine Learning Models for Leakage Detection and Localization in Water Distribution Networks Using Real-System Data
Evaluating Large Language Models Using Gameplay (ClemBench)
Exploring Species-level Similarity in Bayesian Stimulus Priors of Artificial Intelligent Agents
Deep Learning-Based Classification of Body Regions in Intraoperative X-Ray Images
Diffusion Transformer for CT Artifact Compensation
Computed Tomography (CT) is one of the most important modalities in modern medical imaging, providing invaluable cross-sectional anatomical information crucial for diagnosis, treatment planning, and disease monitoring. Despite its widespread utility, the quality of CT images can be significantly degraded by various artifacts arising from physical limitations, patient-related factors, or system imperfections. These artifacts, manifesting as streaks, blurs, or distortions, can obscure critical diagnostic details, potentially leading to misinterpretations and compromising patient care. While traditional iterative reconstruction and early deep learning methods have offered partial solutions, they often struggle with complex artifact patterns or may introduce new inconsistencies. Recently, diffusion models have emerged as a powerful generative paradigm, demonstrating remarkable success in image synthesis and restoration tasks by progressively denoising an image from a pure noise distribution. Concurrently, Transformer architectures, with their inherent ability to capture long-range dependencies via self-attention mechanisms, have shown promise in various vision tasks. This thesis investigates the potential of the Diffusion Transformer for comprehensive CT artifact compensation. By synergizing the iterative refinement capabilities of diffusion models with the global contextual understanding of Transformers, this work aims to develop a robust framework capable of effectively mitigating a wide range of CT artifacts, thereby enhancing image quality and improving diagnostic reliability. This research explores the design, implementation, and rigorous evaluation of such a model, comparing its performance against existing state-of-the-art techniques.
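To make the core idea concrete, the following is a minimal sketch in plain PyTorch of a transformer-based noise predictor operating on image patches, together with one standard DDPM reverse step; the patch size, model width, and noise-schedule handling are illustrative assumptions rather than the architecture developed in the thesis.

import torch
import torch.nn as nn

class TinyDiT(nn.Module):
    """Predicts the noise in a CT slice from patch tokens plus a timestep embedding."""
    def __init__(self, img_size=256, patch=16, d_model=256, nhead=8, depth=4, timesteps=1000):
        super().__init__()
        self.patchify = nn.Conv2d(1, d_model, kernel_size=patch, stride=patch)
        self.t_embed = nn.Embedding(timesteps, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.unpatchify = nn.ConvTranspose2d(d_model, 1, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, (img_size // patch) ** 2, d_model))

    def forward(self, x_t, t):
        tokens = self.patchify(x_t).flatten(2).transpose(1, 2)      # (B, N, d_model)
        tokens = tokens + self.pos + self.t_embed(t).unsqueeze(1)   # add position + timestep
        tokens = self.encoder(tokens)                               # global self-attention
        h = w = int(tokens.shape[1] ** 0.5)
        grid = tokens.transpose(1, 2).reshape(x_t.shape[0], -1, h, w)
        return self.unpatchify(grid)                                # predicted noise, same size as x_t

@torch.no_grad()
def ddpm_step(model, x_t, t, betas):
    """One reverse-diffusion step x_t -> x_{t-1} using the standard DDPM update."""
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    eps = model(x_t, t)
    a_t = alphas[t].view(-1, 1, 1, 1)
    ab_t = alpha_bars[t].view(-1, 1, 1, 1)
    mean = (x_t - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
    noise = torch.randn_like(x_t) if t.min() > 0 else torch.zeros_like(x_t)
    return mean + torch.sqrt(betas[t]).view(-1, 1, 1, 1) * noise

In this sketch every patch attends to every other patch at each denoising step, which reflects the global contextual modelling that the abstract argues is useful for spatially extended artifacts such as streaks.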