DKT Workflow
The tool provides a fully automated workflow for creating metadata packages that comply with the National Digital Library standards for the digitisation of broadside ballads. Starting from a batch of input images, the system detects barcodes, separates the images into individual items, ensures the correct ordering of pages, and enriches the records with bibliographic metadata retrieved from the library catalogue via the Z39.50 protocol. It then converts the metadata into MODS, METS, and ALTO formats, performs optical character recognition using the PERO OCR engine, converts the image files into JPEG2000 format, and generates a complete package ready for ingestion into the Kramerius digital library and long-term digital preservation systems. Because no manual intervention is required, the workflow significantly reduces time and staffing requirements and minimises the risk of human error when processing large volumes of small printed materials.
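The item-separation step can be illustrated with a minimal sketch. It assumes, purely for illustration, that each item's first page carries a barcode, that file names sort in scan order, and that barcode decoding has already happened; the names `Scan`, `Item`, and `split_into_items` are hypothetical and are not taken from the tool itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scan:
    filename: str
    barcode: Optional[str] = None  # decoded barcode value, or None if the page has none


@dataclass
class Item:
    barcode: str
    pages: list = field(default_factory=list)


def split_into_items(scans):
    """Group scans into items; a page with a barcode starts a new item (assumption)."""
    items = []
    # Assumption: sorting by file name restores the scan order.
    for scan in sorted(scans, key=lambda s: s.filename):
        if scan.barcode is not None:
            items.append(Item(barcode=scan.barcode))
        if not items:
            raise ValueError(f"{scan.filename} precedes the first barcode")
        items[-1].pages.append(scan.filename)
    return items
```

A real pipeline would also have to handle unreadable or duplicated barcodes, which this sketch does not attempt.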
PDF Curator
PDF Curator is a tool designed to convert any PDF file into a structured JSON format. It combines OCR, layout detection, and image captioning to produce a JSON file that captures key document elements, including:
- Text from individual pages
- Coordinates of text blocks
- Chapter headings and chapter lists
- Coordinates and descriptions of non-text elements
- A list of non-text elements
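To make the list above concrete, the following sketch shows one shape such a JSON file might take. The actual schema is not given here, so every key name (`pages`, `text_blocks`, `bbox`, `non_text_elements`, `chapters`) and all sample values are assumptions chosen only for illustration.

```python
import json

# Hypothetical output for a one-page PDF; all key names and values are illustrative.
document = {
    "pages": [
        {
            "number": 1,
            "text": "1 Introduction\nSample body text.",
            "text_blocks": [
                # bbox as [x0, y0, x1, y1] in page units (assumed convention)
                {"bbox": [72, 90, 520, 118], "text": "1 Introduction"},
                {"bbox": [72, 130, 520, 400], "text": "Sample body text."},
            ],
            "non_text_elements": [
                {"type": "figure", "bbox": [100, 420, 480, 700],
                 "description": "Sample caption produced by image captioning"},
            ],
        }
    ],
    "chapters": [{"title": "1 Introduction", "page": 1}],
}


def non_text_elements(doc):
    """Flatten per-page non-text elements into one list, tagging each with its page."""
    return [
        {"page": page["number"], **element}
        for page in doc["pages"]
        for element in page.get("non_text_elements", [])
    ]


serialised = json.dumps(document, indent=2)
```

Keeping coordinates next to both the text blocks and the non-text elements would let a consumer map any extracted element back to its position on the page.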
Libri Augmentati
This software downloads all available (or selected) data and metadata, such as MODS, FOXML, DC, ALTO, and images, from digital libraries based on the Kramerius system and stores them locally. It also uses services provided by the LINDAT/CLARIAH-CZ Research Infrastructure (https://lindat.cz), supported by the Ministry of Education, Youth and Sports of the Czech Republic (Project No. LM2023062).
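As a rough sketch of how the per-object downloads might be addressed, the helper below builds URLs for the data types named above. It assumes the target library exposes a Kramerius 5 style REST layout (`item/<pid>/streams/<stream>` and `item/<pid>/foxml`) with the stream identifiers shown; the host name, the PID, and the function `stream_url` are hypothetical, and the layout should be checked against the actual instance before use.

```python
from urllib.parse import quote

# Assumed mapping from data type to Kramerius stream identifier.
STREAMS = {
    "mods": "BIBLIO_MODS",
    "dc": "DC",
    "alto": "ALTO",
    "image": "IMG_FULL",
}


def stream_url(api_base, pid, kind):
    """Build the download URL for one data type of one object (assumed URL layout)."""
    base = api_base.rstrip("/")
    item = f"{base}/item/{quote(pid, safe=':')}"
    if kind == "foxml":
        return f"{item}/foxml"
    if kind not in STREAMS:
        raise ValueError(f"unknown data type: {kind}")
    return f"{item}/streams/{STREAMS[kind]}"


# Example with a placeholder host and PID:
url = stream_url("https://kramerius.example.org/search/api/v5.0", "uuid:1234", "mods")
```

The sketch only constructs URLs; fetching and storing the responses is left out.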
MARC Comparator
The MARC Comparator Project is a comprehensive system for processing MARC bibliographic records, providing both a Python SDK and a full-featured backend and client application. It enables validation, comparison, and authority linking of MARC records, with support for the external Aleph catalogue. The React client provides an intuitive interface for users to interact with the backend.
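The idea of record comparison can be sketched in a few lines. This is not the project's SDK: it assumes, for illustration only, that each record has already been reduced to a dictionary mapping a MARC tag to its list of field values, and the function name `compare_records` is hypothetical.

```python
def compare_records(left, right):
    """Return {tag: (left_values, right_values)} for every tag whose values differ."""
    diff = {}
    for tag in sorted(set(left) | set(right)):
        left_values = left.get(tag, [])
        right_values = right.get(tag, [])
        if left_values != right_values:
            diff[tag] = (left_values, right_values)
    return diff


# Two simplified records: the title differs and a subject field was added.
local = {"100": ["Author, A."], "245": ["Title A"]}
remote = {"100": ["Author, A."], "245": ["Title B"], "650": ["Ballads"]}
differences = compare_records(local, remote)
```

A full comparison would also need to consider indicators, subfield codes, and field order, which this simplified representation leaves out.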