The OCR module enables the integration of DSpace with external Optical Character Recognition (OCR) software. Out of the box, the module supports the open-source Tesseract OCR engine. For each image, a curation task extracts its text representation in hOCR format for full-text indexing in Solr. Tesseract supports a very large set of languages, including Italian, French, Spanish, German, Arabic, Simplified and Traditional Chinese, and many others.
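To make the image-to-hOCR-to-index flow more concrete, here is a minimal Python sketch of the two steps around the OCR engine: building a Tesseract command line that writes hOCR, and pulling the plain words back out of an hOCR document so they could be sent to a full-text index. This is not the module's own code. The function names, the `LANGUAGE_CODES` table and the file names are hypothetical, and the sketch assumes the standard `tesseract <image> <output base> -l <langs> hocr` invocation and the usual `ocrx_word` class on hOCR word spans.

```python
from html.parser import HTMLParser

# Tesseract traineddata codes for the languages named in the post.
LANGUAGE_CODES = {
    "Italian": "ita",
    "French": "fra",
    "Spanish": "spa",
    "German": "deu",
    "Arabic": "ara",
    "Simplified Chinese": "chi_sim",
    "Traditional Chinese": "chi_tra",
}


def build_tesseract_command(image_path, output_base, languages):
    """Return the argv for a Tesseract run that writes <output_base>.hocr.

    Hypothetical helper: pass the result to subprocess.run() on a machine
    where the tesseract binary and the language data are installed.
    """
    codes = "+".join(LANGUAGE_CODES[name] for name in languages)
    return ["tesseract", image_path, output_base, "-l", codes, "hocr"]


class _HocrWordCollector(HTMLParser):
    """Collects the text of every element marked with class 'ocrx_word'."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._depth = 0  # greater than 0 while inside a word span

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Nested markup inside a word, e.g. <strong> or <em>.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._depth = 1
            self.words.append("")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.words[-1] += data


def hocr_to_text(hocr):
    """Return the recognized words of an hOCR document as one string."""
    collector = _HocrWordCollector()
    collector.feed(hocr)
    return " ".join(w.strip() for w in collector.words if w.strip())
```

A caller would run the command with `subprocess.run(build_tesseract_command("page.png", "page", ["Italian"]), check=True)`, read the resulting `page.hocr`, and hand `hocr_to_text(...)` to whatever indexes the item. The hOCR file itself also carries word bounding boxes in the `title` attributes, which this sketch ignores.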