Due to browser memory limitations, the files are read in several chunks.
By Frode Eika Sandnes, OsloMet, March, 2025
Batch convert PDF documents to JSON with file info, text contents, page info and figure info.
Image data (stored as data URLs) are saved in a separate JSON file.
For a large number of large PDF files, run the tool several times and join the parts using the separate json-joiner tool.
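
The chunked reading mentioned above can be sketched roughly as follows. This is only an illustration of the general idea, not the tool's actual code: the names toChunks and processInChunks, the chunk size, and the handle callback are all hypothetical.

```typescript
// Hypothetical helper: split a list of items into batches of at most `size`.
function toChunks<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical driver: handle one batch at a time, so that at most
// `size` files are being read concurrently.
async function processInChunks<T>(
  items: T[],
  size: number,
  handle: (item: T) => Promise<void>,
): Promise<void> {
  for (const chunk of toChunks(items, size)) {
    await Promise.all(chunk.map(handle));
  }
}
```

A smaller chunk size lowers peak memory use at the cost of less parallel reading.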
Use Ctrl/Shift to select multiple files (Ctrl-A for all).
Reading PDF files and placing contents in temporary IndexedDB... should be quite quick!
Setting up local IndexedDB for temporarily storing partial results. This may take a while...
Extracting the text and image contents.
Current file:
Status: / reports processed
Please wait for the IndexedDB in the browser to be cleaned... this could take a while...
Please find the two JSON files containing the text contents and image contents in your downloads folder.
Reload to convert more PDF documents.
Use the json-merging tool if you need to combine several results.
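
As a rough illustration of what combining several results could look like, here is a minimal sketch. It assumes that each result file holds a JSON array of per-document records; the actual file layout is not described here, so that layout, the DocRecord type and the joinParts name are all hypothetical.

```typescript
// Assumption: each part is the text of a JSON file holding an array of
// per-document records. The real layout may differ.
type DocRecord = Record<string, unknown>;

// Hypothetical helper: concatenate the records of all parts, in order.
function joinParts(parts: string[]): string {
  const merged: DocRecord[] = [];
  for (const text of parts) {
    const parsed: unknown = JSON.parse(text);
    if (!Array.isArray(parsed)) {
      throw new TypeError("expected each part to be a JSON array");
    }
    for (const rec of parsed) {
      merged.push(rec as DocRecord);
    }
  }
  return JSON.stringify(merged);
}
```

If the files turn out to be objects keyed by file name rather than arrays, the same loop would merge keys instead of appending records.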