UTF-8 encoding and decoding is not working on the REST API /xfdf/merge

sample.zip (1.4 MB)

Hi there! There seems to be an encoding/decoding bug in your REST API.
We are handling Japanese UTF-8 annotations, and WebViewer displays the Japanese annotations correctly when it loads the XFDF file together with the PDF file.
However, when we call https://api.pdfjs.express/xfdf/merge to merge the XFDF file into the PDF, the returned PDF file contains garbled text annotations. The code is below, and the XFDF file and the PDF link are attached together with screenshots.

Can you check it on your end? Thx!

import requests

api = 'https://api.pdfjs.express/xfdf/merge'
pdf = 'https://ufocatch.com/1.pdf'

# Read the XFDF as UTF-8 text so the Japanese annotations are decoded correctly.
with open('tmp/rest_api_test/xfdf/1.xml', 'r', encoding='UTF-8') as f:
    xfdf = f.read()

data = {
    'file': pdf,
    'xfdf': xfdf,
}

# The dummy part makes requests send the body as multipart/form-data.
r = requests.post(api, files={'dummy': (None, 'dummy')}, data=data)
res = r.json()

# Download the merged PDF using the key returned by the merge call.
r = requests.get(res['url'], headers={'Authorization': res['key']})
with open('tmp/rest_api_test/out1.pdf', 'wb') as f:
    f.write(r.content)
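
To rule out the local file as the source of the garbling, a quick check like the sketch below can confirm that the XFDF bytes are valid UTF-8 and really contain non-ASCII characters before they are sent. The helper name `count_non_ascii` and the sample string are illustrative assumptions, not part of the script above.

```python
def count_non_ascii(raw: bytes) -> int:
    """Return the number of non-ASCII characters in UTF-8 encoded bytes.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    text = raw.decode('utf-8', errors='strict')
    return sum(1 for ch in text if ord(ch) > 127)


# Hypothetical sample standing in for the contents of the XFDF file.
sample = '<contents>日本語の注釈</contents>'.encode('utf-8')
print(count_non_ascii(sample))
```

For the real file, the bytes could be read with `open(path, 'rb').read()` and passed to the same helper. A count of zero, or a `UnicodeDecodeError`, would point at the file rather than the API.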

Hey there!

Sorry for the delay on this one.

I’ll investigate this and get back to you soon.

Thanks!
Logan

Hey! Just a heads up that I think I identified the issue and am working on a fix. I’ll let you know when it’s pushed.

Great! Looking forward to the fix!!

Hi @SunnyTokyo,

Just a heads up that this has been fixed and pushed into production. You shouldn’t have to make any changes to your code.

Let me know if you have any issues!

Thanks,
Logan

Looks like it’s getting better! But the hover pop-up is still garbled. Just one more step, please ;-)

Ahh sorry I missed that :laughing: I’ll look into it!

Hi there,

After a bit of investigation, this turns out to be a bug in the software you use to open the document, not in PDF.js Express or its API. If you open the document in Acrobat or a different reader, the text appears fine.
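
For anyone who wants to check a stored string without relying on a particular viewer, here is a minimal sketch of how a PDF text string is usually interpreted: a leading byte order mark selects UTF-16BE (or UTF-8 in PDF 2.0), and anything else is PDFDocEncoding. The function name and the sample bytes are hypothetical, the Latin-1 fallback is only an approximation of PDFDocEncoding, and pulling the raw bytes out of a PDF is outside the scope of this sketch. It does not claim to show what the affected viewer actually does.

```python
def decode_pdf_text_string(raw: bytes) -> str:
    """Decode the raw bytes of a PDF text string based on its byte order mark."""
    if raw.startswith(b'\xfe\xff'):
        # UTF-16BE with BOM: the usual form for non-Latin text such as Japanese.
        return raw[2:].decode('utf-16-be')
    if raw.startswith(b'\xef\xbb\xbf'):
        # UTF-8 with BOM, allowed since PDF 2.0.
        return raw[3:].decode('utf-8')
    # No BOM means PDFDocEncoding; Latin-1 is only an approximation (assumption).
    return raw.decode('latin-1')


# Hypothetical bytes as they might be stored for an annotation's pop-up text.
stored = b'\xfe\xff' + 'テスト'.encode('utf-16-be')
print(decode_pdf_text_string(stored))
```

If the decoded string is correct here but a viewer still shows garbled characters, that is consistent with the problem being on the viewer side.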

Thanks!
Logan

Oh, we didn’t notice that… Confirmed that Acrobat shows the correct characters. Thx again!!