DBI-Angebot

Structure Analysis and Retrodigitization of Scientific Documents

Proposer and Project Partner: Institute for Experimental Mathematics of the University of Essen
(Professor Dr. Gerhard Michler)
and
Department of Information Engineering, Faculty of Engineering, Shinshu University, Nagano, Japan (M. Okamoto)
and
Graduate School of Mathematics, Kyushu University, Fukuoka, Japan
Project Description: Structure Analysis and Retrodigitization of Scientific Documents
Patron Institution: Deutsche Forschungsgemeinschaft
Patron Support Programme: Library Support Programme 'International Cooperation' in the funding area 'Distributed Digital Research Library'
Duration: 2000 - 2002
Short Description or Reference to Initial Projects: From 1 April 1997 to 30 September 2000 the DFG provided financial support for the applicant's research project "Retro-Digitalisierung der Zeitschrift Archiv der Mathematik". Applying the methods developed by the Essen study group to the retrodigitization of many different scientific journals appears to require a deeper understanding of OCR technology.
The Japanese research groups of Professor Okamoto (Nagano) and Professor Suzuki (Fukuoka) both have long experience in producing special OCR programs for structure analysis and recognition of scientific documents. However, neither OCR system was designed for the special purpose of retrodigitizing many volumes of different mathematical or other scientific research journals.
The Essen study group has successfully used Okamoto's special mathematical formula recognizer EXP for the retrodigitization of six volumes of the mathematical journal "Archiv der Mathematik". However, substantial extensions of the retrodigitization software developed so far are necessary. The planned new programs for separating mathematical formulas from the ordinary text of a scanned page are expected to improve the text recognition success rate substantially.
The principal investigators of the three research groups named above have agreed to cooperate from 1 October 2000 to 30 September 2002 in the research area of structure analysis and retrodigitization of scientific documents. Their joint research will be devoted to the following subprojects.
- Extensions of special OCR software for the recognition of mathematical formulas and other scientific symbols or diagrams.
- Development of practical algorithms for the separation of the mathematical formulas from the ordinary text of a scanned page.
- Segmentation programs for the resolution of touching characters within ordinary texts or mathematical formulas.
Contact: Professor Dr. G. Michler
Institute for Experimental Mathematics
University of Essen
Ellernstraße 29
45326 Essen
URL: --

Last updated: October 2000
