WDX plugin pdfOCR - Show details of PDF files

Post by *petermad » 2022-12-31, 15:32 UTC

2Usher

but they might be disabled globally when the board software was updated

My profile still has it enabled after board software updates.

Usher · Post by *Usher » 2022-12-31, 15:47 UTC

2white
You are right, the problem with ban per user may still exist. There is an option in user profile to define friends and foes, but:

phpBB wrote:Private messages from foes are still permitted.

RalfTC · Post by *RalfTC » 2023-01-01, 10:03 UTC

Ah, just found the settings options in my profile

Thx to all!

PM is underway.

Usher · Post by *Usher » 2023-01-01, 17:30 UTC

2RalfTC
I've got it, waiting for the package

Usher · Post by *Usher » 2023-01-01, 21:33 UTC

2RalfTC
Well, your "document" is just a copy of a web page printed to PDF using NitroPDF software (either as a browser plugin or as a virtual printer). It contains a default text header (page title, empty, page URL) and a default text footer (page No of Total, empty, print timestamp). For some unknown reason the web page is saved as a picture with a screenshot in it. I haven't tested NitroPDF, so maybe that's another default setting.

As you can see, it's not an original invoice printed to PDF from a billing program. It's not even a good screenshot: the right side of the invoice is cut off. However, this PDF contains both text and a picture on a single page, so the pdfOCR plugin correctly shows needOCR=0 as the number of pages containing only pictures.
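
To make the meaning of needOCR concrete, here is a minimal sketch in Python. It is not the plugin's actual code: it assumes the per-page facts (does the page have a text layer, does it have images) are already known, and the `Page` class and `count_need_ocr` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical per-page summary; not a structure used by pdfOCR itself."""
    has_text: bool
    has_images: bool


def count_need_ocr(pages):
    """Count pages that contain pictures but no text (candidates for OCR)."""
    return sum(1 for p in pages if p.has_images and not p.has_text)


# A one-page PDF with a text header/footer plus a screenshot, as described above.
printed_web_page = [Page(has_text=True, has_images=True)]
# A two-page scan without any text layer (assumed example).
pure_scan = [Page(has_text=False, has_images=True)] * 2

print(count_need_ocr(printed_web_page))  # 0
print(count_need_ocr(pure_scan))         # 2
```

Under this reading, a page with any text on it never counts, no matter how much of the visible content is actually a picture.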

gammabubble · Post by *gammabubble » 2024-05-28, 16:27 UTC

The plugin was working perfectly on my Windows 11 system but suddenly broke. No major changes have been made to the system. It now shows all PDF files with need OCR as -71 and total pages as -4.

I completely uninstalled and reinstalled Total Commander, and also removed and reinstalled the plugin several times, but it still shows the same negative values.

Is there anything else anyone can suggest? Thanks.

tuska · Post by *tuska » 2024-05-28, 17:51 UTC

gammabubble wrote: 2024-05-28, 16:27 UTC Is there anything else anyone can suggest?

Well, ...

cpd.bat
::verz 1: ne radi sa unicode imenima
(translation: version 1 does not work with Unicode names)

pdfOCR 0.9 wrote:Limitations:
- Unicode file names – in this version they are not supported, so please use only ANSI names.
If non ANSI names are used the numbers of pages will be negative or very high number.
- Speed – plugin is relatively slow, so when you activate this plugin in a panel of Total Commander
please be patient until the analyzing is finished and you get your cursor ready again.

pdfOCR 0.9 wrote:Bugs:
negative page numbers or very high page numbers: that usually happen if pdf is not properly formatted.
In that case the following procedure is suggested to try:
1) open the pdf file in any pdf reader that can read pdf and re-save the pdf file
2) rename the offending pdf file temporarily with active plugin to force it to reread it.
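
To find files affected by the Unicode limitation quoted above, a small script can list the PDF files whose names cannot be encoded in the ANSI code page. This is only a sketch: it assumes the ANSI code page is Windows-1252 (`cp1252`), which differs on other system locales, and the function names are mine, not part of the plugin.

```python
import os


def is_ansi_name(name, codepage="cp1252"):
    """Return True if the file name can be encoded in the given ANSI code page.

    cp1252 is an assumption; adjust it to the system's actual ANSI code page.
    """
    try:
        name.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


def non_ansi_pdfs(directory, codepage="cp1252"):
    """List PDF file names in `directory` that are not representable in ANSI."""
    return sorted(
        name
        for name in os.listdir(directory)
        if name.lower().endswith(".pdf") and not is_ansi_name(name, codepage)
    )


if __name__ == "__main__":
    # Print the offending names in the current directory.
    for name in non_ansi_pdfs("."):
        print(name)
```

Files reported by this script are the ones to rename to plain ANSI names before judging the plugin's numbers.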

⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺
Example
2019-11-01.pdf renamed to: ...

Name/Ext.

_2019-11-01.pdf

[=pdftrebaocr.totalPages] -> totalPages -3
[=xpdfsearch.Number of Pages] -> totalPages: 116 (xPDFSearch 1.41 - Content plugin to search text in PDF files)

UNICODE character

results in totalPages -3 in plugin pdfOCR 0.9.

After renaming, I also noticed that, along with the negative value in "totalPages",
the content of the "needOCR" column changed from 1 to 0.
A TC restart (cm_exit 9) brought the same result after applying the 'Custom Columns view'.

Furthermore, when this PDF file was renamed back in the 'Custom Columns view', the next(!) file (the one below it) was affected:
its "totalPages" column changed from 148 to -3 and its "needOCR" column changed
from 0 to 1, although that PDF file had NOT been renamed and NO UNICODE character was present in its file name!

Only a TC restart with renewed application of the 'Custom Columns view' caused the data to be corrected.

This example confirms that the plugin "pdfOCR 0.9" does NOT work with UNICODE file names!

In a directory with 80 PDF files, the values 0, 1, 2, 3, 4, 6 were displayed in the "needOCR" column.
I was able to search those PDF files with a value >0 for text without any problems (random tests carried out).
SumatraPDF v.3.5.2 64-Bit

⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺
Plugin: pdfOCR 0.9 | pdftrebaocr

totalPages	[=pdftrebaocr.totalPages]
needOCR	[=pdftrebaocr.needOCR]
password	[=pdftrebaocr.password]

⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺⸺

My conclusion:
In its current version, this plugin is not suitable for me.

gammabubble · Post by *gammabubble » 2024-05-31, 01:05 UTC

Thanks @Tuska for the feedback.

If the pdfOCR plugin crashes for some reason, Total Commander won't launch it even after you close and re-open Total Commander. I had to completely reboot the computer and then relaunch Total Commander, which got pdfOCR working again. That worked a few times, but the issue has now returned, with a needOCR value of -71 and total pages of -4 on all the PDFs. Rebooting the computer is not helping this time.

I wonder if cpd.bat or another required file is somehow being blocked by Windows Defender SmartScreen, which would cause the issues here.
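
One way to test that suspicion: files downloaded from the internet usually carry a "Mark of the Web", stored on NTFS in an alternate data stream named Zone.Identifier, and Windows takes that mark into account when deciding whether to warn about or block a file. Here is a sketch in Python that reads the stream. It assumes the plugin files are on an NTFS volume, the path in the usage comment is hypothetical, and whether this mark is actually behind the negative values is not confirmed.

```python
def parse_zone_id(stream_text):
    """Extract the ZoneId number from the text of a Zone.Identifier stream.

    Returns None if no valid ZoneId line is present.
    """
    for line in stream_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "zoneid":
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


def read_zone_id(path):
    """Return the ZoneId of `path`, or None if the file has no mark.

    Works only on Windows/NTFS, where "file:stream" opens an alternate data stream.
    """
    try:
        with open(path + ":Zone.Identifier", "r", errors="replace") as f:
            return parse_zone_id(f.read())
    except OSError:
        return None


# Usage (hypothetical path): ZoneId 3 means the file came from the Internet zone.
# print(read_zone_id(r"C:\totalcmd\plugins\wdx\pdfOCR\cpd.bat"))
```

If a mark is present, it can be removed with the "Unblock" checkbox in the file's Properties dialog or with PowerShell's Unblock-File.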
