Unsupervised learning of unknown fonts: requires only document images and a corpus of text. Ability to handle noisy documents: inconsistent inking, spacing, vertical alignment, etc. Support for ...