PDF.js Express Version
|UI version|‘8.1.0’|
|Core version|‘8.1.0’|
|Build|‘OS8yMy8yMDIxfGM3YjM4YzBmOQ==’|
PDFTron Version
|UI version|‘8.1.0’|
|Core version|‘8.1.0’|
|Build|‘OS85LzIwMjF8NTU4Zjg4N2Fk’|
Detailed description of issue
If I select or highlight text in a PDF, not all characters are selected or highlighted.
It also happens if I upload the PDF to your demo application.
The weird thing is that I tried the same PDF (and different PDFs as well) with PDF.js Express Plus and PDFTron (both of which are available as free demo downloads), and the problem only happens in PDF.js Express Plus. In PDFTron the issue doesn't occur. The screenshot of the expected behaviour is from PDFTron and the screenshot of the issue is from PDF.js Express Plus.
Expected behaviour
I expect that all characters which are selected are highlighted.
Thank you for the detailed bug report. This is a known issue - it actually stems from the core rendering engine (PDF.js) which we do not normally support - however I will investigate this and see if I can find a fix.
No, PDFTron uses a custom rendering engine developed in-house over the last 20 years. It has much higher accuracy and better text parsing, which is why text selection works better in some scenarios.
As mentioned, text selection is a function of the PDF.js Core library, which we do not support.
I tried to resolve this issue by digging into the PDF.js Core, but could not come up with a good solution - the library is simply giving us invalid text location data. This can happen for a wide variety of reasons, so it is impossible to come up with a single generic fix.
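For anyone who wants to check whether a particular PDF is affected, here is a rough sketch of how the text location data could be inspected. It assumes you can obtain the `items` array that upstream PDF.js returns from `page.getTextContent()`; whether and how PDF.js Express exposes that object is not covered here. The `TextItemLike` interface and the `findSuspectTextItems` helper are hypothetical names made up for this illustration, and the checks are only a heuristic, not a description of how the viewer itself computes selections.

```typescript
// Minimal shape of the fields read from a PDF.js text item.
// Assumption: items come from page.getTextContent() in upstream PDF.js.
interface TextItemLike {
  str: string;
  transform: number[]; // [a, b, c, d, e, f]; e and f hold the x/y position
  width: number;
  height: number;
}

// Hypothetical helper: returns the items whose geometry cannot yield a
// usable selection rectangle (non-finite numbers or a non-positive size).
function findSuspectTextItems(items: TextItemLike[]): TextItemLike[] {
  return items.filter((item) => {
    // Whitespace-only items are skipped; they carry no visible glyphs.
    if (item.str.trim().length === 0) {
      return false;
    }
    const values = [...item.transform, item.width, item.height];
    const wellFormed =
      item.transform.length === 6 && values.every((v) => Number.isFinite(v));
    return !wellFormed || item.width <= 0 || item.height <= 0;
  });
}
```

If this returns items for the characters that refuse to highlight, the bad data is most likely coming from the text extraction step rather than from the highlighting code.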
We will continue investigating text selection issues over time, but I cannot guarantee when this one will be fixed, or whether it ever will be.