Spatio-Temporal Handwriting Imitation
Written by Vincent Christlein
Contribution
To the best of our knowledge, this is the first paper that synthesizes realistic Western cursive handwriting by mimicking the actual temporal writing process of a single handwritten line. Our method works nearly fully automatically (e.g., no manual cutting out of characters) and is able to render full lines instead of single glyphs/words as in other works.
Possible use cases we envisioned
- Enable physically impaired persons to write in their personal handwriting.
- Send personal greeting notes in your own handwriting without writing them on paper and scanning them.
- Personalize your game/virtual-environment experience by using your (or a famous person's) handwriting.
- Generate handwriting to train modern text recognition engines, which commonly need a lot of training samples.
- Enable forensic researchers to generate and imitate handwriting without needing a person, and thus to create more robust writer identification methods.
Limitations
- At its current stage, the generation works quite robustly but has problems with artifacts, missing punctuation marks, and synthesizing writers with poor handwriting.
- The imitation of the general writing style works well, as does the pen imitation. However, the stroke imitation is still limited (this can mainly be attributed to the method of A. Graves [1], which we use for this part; see below). Thus, most people, and especially forensic experts, will notice the difference from the original.
- Currently, the method needs a transcription of the line to be imitated.
- Currently, the background is not simulated.
Overview of our current proposed system
Authors: Martin Mayr, Martin Stumpf, Anguelos Nicolaou, Mathias Seuret, Andreas Maier, Vincent Christlein
Preprint: https://arxiv.org/abs/2003.10593
Code: https://github.com/M4rt1nM4yr/spatio-temporal_handwriting_imitation
From an offline sample, we create a temporal (online) sequence, which is used for the stroke style modification (top path). Eventually, we transfer it back to the offline domain, adapted to the image style, i.e., pen style and color (bottom path).
Offline to online conversion
First, we need to create a robust skeleton. To be able to train the system with online data, we developed an iterative knowledge transfer, which can be seen as an unrolled CycleGAN; unrolling avoids the cycle-consistency constraint, which in turn allows the later stroke modification. Starting from a naive mapping, it learns a more general mapping from offline handwriting to a handwriting skeleton. We show that this is more robust than a classical skeletonization.
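For reference, the classical skeletonization baseline that our learned mapping is compared against can be sketched in a few lines with scikit-image. This is only the baseline, not our method; the input/output contract is the same (grayscale line image in, one-pixel-wide skeleton out), and the Otsu threshold is an assumption for dark ink on a light background:

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import skeletonize

def classical_skeleton(gray: np.ndarray) -> np.ndarray:
    """Classical baseline: Otsu binarization followed by morphological
    thinning. Assumes dark ink on a light background."""
    ink = gray < threshold_otsu(gray)  # boolean ink mask
    return skeletonize(ink)            # one-pixel-wide skeleton
```

Such a pipeline breaks down on faded ink, low contrast, or noisy backgrounds, which is exactly where the learned skeletonization pays off.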
The second step is to generate the stroke sequence. We were unable to train the system with constant-rate resampling; thus, we propose 'maximum acceleration resampling', which samples points on straight lines only sparsely.
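The exact scheme is described in the paper; below is a minimal sketch of the idea under our own simplifying assumptions (`v0`, `a`, and `angle_tol` are illustrative parameters, not values from the paper): the spacing between samples may grow by at most a fixed 'acceleration' per step while the path stays straight, and resets at turns, so straight segments end up sparsely sampled.

```python
import numpy as np

def max_accel_resample(points: np.ndarray, v0: float = 1.0,
                       a: float = 0.5, angle_tol: float = 0.1) -> np.ndarray:
    """Sketch of maximum-acceleration resampling for a 2D polyline (N, 2):
    the step size grows by at most `a` per sample while the local direction
    change stays below `angle_tol` radians, and resets to `v0` at turns."""
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])  # cumulative arc length

    def point_at(d):
        # Linear interpolation of the polyline at arc length d.
        return np.array([np.interp(d, s, points[:, 0]),
                         np.interp(d, s, points[:, 1])])

    out, pos, step = [point_at(0.0)], 0.0, v0
    while pos + step < s[-1]:
        pos += step
        nxt = point_at(pos)
        ahead = point_at(min(pos + step, s[-1]))  # look one step ahead
        d1, d2 = nxt - out[-1], ahead - nxt
        cos = np.dot(d1, d2) / (np.linalg.norm(d1) * np.linalg.norm(d2) + 1e-9)
        ang = np.arccos(np.clip(cos, -1.0, 1.0))  # local direction change
        out.append(nxt)
        step = step + a if ang < angle_tol else v0  # accelerate on straights
    out.append(point_at(s[-1]))
    return np.array(out)
```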
Content transfer
Here we make use of the seminal work by Alex Graves [1].
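In a nutshell, Graves's model is an RNN that, at every time step, outputs the parameters of a bivariate Gaussian mixture over the next pen offset plus a Bernoulli pen-lift probability, conditioned on the text through a soft attention window. The sampling step at the model's output looks roughly as follows (numpy only; the RNN and the attention window are omitted):

```python
import numpy as np

def sample_offset(pi, mu, sigma, rho, e, rng=np.random.default_rng()):
    """Sample one pen movement (dx, dy, pen_lift) from the mixture-density
    output of a Graves-style handwriting RNN [1].
    pi: (K,) mixture weights, mu: (K, 2) means, sigma: (K, 2) std devs,
    rho: (K,) correlations, e: scalar pen-lift probability."""
    k = rng.choice(len(pi), p=pi)  # pick a mixture component
    cov = np.array([[sigma[k, 0] ** 2, rho[k] * sigma[k, 0] * sigma[k, 1]],
                    [rho[k] * sigma[k, 0] * sigma[k, 1], sigma[k, 1] ** 2]])
    dx, dy = rng.multivariate_normal(mu[k], cov)  # bivariate Gaussian sample
    return dx, dy, rng.random() < e               # Bernoulli pen lift

# Example with K = 2 components and made-up parameter values:
pi = np.array([0.7, 0.3])
mu = np.array([[1.0, 0.0], [0.0, 1.0]])
sigma = np.array([[0.5, 0.5], [0.3, 0.3]])
rho = np.array([0.0, 0.2])
print(sample_offset(pi, mu, sigma, rho, e=0.05))
```

Running this autoregressively, feeding each sampled offset back into the RNN, produces a full stroke sequence; the limited stroke imitation mentioned in the limitations above stems from this stage.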
Pen style transfer
For the image/pen style transfer, we modify the well-known pix2pix framework to incorporate the style information.
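The exact architecture is in the paper and the repository; the sketch below only illustrates the general idea of conditioning a pix2pix-style generator on a style code, here, as one plausible choice, by concatenating a style embedding to the bottleneck features. The layer sizes and the 64-dimensional style vector are illustrative assumptions:

```python
import torch
import torch.nn as nn

class StyleConditionedGenerator(nn.Module):
    """Illustrative sketch (not our exact architecture): a pix2pix-style
    encoder-decoder whose bottleneck is concatenated with a style vector,
    e.g. an embedding extracted from a sample of the target handwriting."""
    def __init__(self, style_dim: int = 64):
        super().__init__()
        self.enc = nn.Sequential(  # skeleton image -> bottleneck features
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.dec = nn.Sequential(  # features + style -> styled pen image
            nn.ConvTranspose2d(128 + style_dim, 64, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh())

    def forward(self, skeleton: torch.Tensor, style: torch.Tensor):
        h = self.enc(skeleton)  # (B, 128, H/4, W/4)
        s = style[:, :, None, None].expand(-1, -1, h.shape[2], h.shape[3])
        return self.dec(torch.cat([h, s], dim=1))  # style broadcast spatially

# Example: a 64x256 skeleton line image and a 64-d style code.
g = StyleConditionedGenerator()
out = g(torch.randn(1, 1, 64, 256), torch.randn(1, 64))
print(out.shape)  # torch.Size([1, 3, 64, 256])
```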
Results
We created a user study, which is available at https://forms.gle/MGCPk5UkxnR23FqT9. We would like to encourage everyone to still participate so that we get more data to draw statistics from.
Main results:
1. In a Turing test, participants correctly identified our faked samples as fake only 50.3% of the time (while correctly identifying 67.7% of the real samples as real).
2. Furthermore, a writer identification system attributed faked paragraphs to the writer being imitated in 25% of cases.
References:
[1] A. Graves, "Generating Sequences With Recurrent Neural Networks," arXiv:1308.0850. https://arxiv.org/abs/1308.0850