UTF-8 encode and decode is not working on rest api /xfdf/merge

SunnyTokyo · March 23, 2021, 7:42am

Hi there! There seems to be an encode/decode bug in your REST API.
We are handling Japanese UTF-8 annotations, and WebViewer displays the Japanese annotations correctly when loading the XFDF file together with the PDF file.
However, when we call https://api.pdfjs.express/xfdf/merge to merge the XFDF file into the PDF, the returned PDF file contains garbled text annotations. The code is below. The XFDF file and the PDF link are attached together with screenshots.

Can you check it on your end? Thx!

import json

import requests

api = 'https://api.pdfjs.express/xfdf/merge'
pdf = 'https://ufocatch.com/1.pdf'

# Read the XFDF as UTF-8 text
with open('tmp/rest_api_test/xfdf/1.xml', 'r', encoding='UTF-8') as f:
    xfdf = f.read()

data = {
    'file': pdf,
    'xfdf': xfdf,
}

# The dummy part makes requests send the form as multipart/form-data
r = requests.post(api, files={'dummy': (None, 'dummy')}, data=data)
res = json.loads(r.text)

# Download the merged PDF using the URL and key from the response
r = requests.get(res['url'], headers={'Authorization': res['key']})
with open('tmp/rest_api_test/out1.pdf', 'wb') as f:
    f.write(r.content)
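
For reference, a minimal sketch of how this kind of garbling typically arises. It assumes the UTF-8 bytes are decoded with a single-byte codec such as Latin-1 somewhere along the way; that is only an illustration, the actual server-side cause is not stated in this thread. The sample string is hypothetical.

```python
# Illustration only: the real cause on the API side is not known from this thread.
original = "日本語の注釈"  # hypothetical annotation text (6 characters)

# Each of these characters takes 3 bytes in UTF-8.
utf8_bytes = original.encode("utf-8")

# Decoding UTF-8 bytes with the wrong codec yields mojibake:
# one garbled character per byte instead of one per 3 bytes.
garbled = utf8_bytes.decode("latin-1")

# Latin-1 maps every byte to exactly one code point, so this
# particular kind of damage can be reversed.
recovered = garbled.encode("latin-1").decode("utf-8")

print(len(original), len(utf8_bytes), len(garbled))
print(recovered == original)
```

If the garbled annotation text is three times as long as the original, a mismatch of this kind is a reasonable first suspect.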

Logan · March 29, 2021, 5:43pm

Hey there!

Sorry for the delay on this one.

I’ll investigate this and get back to you soon.

Thanks!
Logan

Logan · April 1, 2021, 5:13pm

Hey! Just a heads up that I think I identified the issue and am working on a fix. I’ll let you know when it’s pushed.

SunnyTokyo · April 2, 2021, 1:03am

Great! Looking forward to the fix!!

Logan · April 6, 2021, 9:08pm

Hi @SunnyTokyo,

Just a heads up that this has been fixed and pushed into production. You shouldn’t have to make any changes to your code.

Let me know if you have any issues!

Thanks,
Logan

SunnyTokyo · April 7, 2021, 9:37am

Looks like it’s getting better!! But the hover pop-up is still garbled! Just one more step, pls ;-)

Logan · April 8, 2021, 2:53pm

Ahh, sorry, I missed that. I’ll look into it!

Logan · April 16, 2021, 9:01pm

Hi there,

After a bit of investigation, this is a bug in the software you use to open the document, and not PDF.js Express or its API. If you open the document in Acrobat or a different reader, the text appears fine.

Thanks!
Logan
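
As background on why viewers can disagree: PDF text strings (such as annotation contents) are stored either in PDFDocEncoding or as UTF-16BE with a leading byte order mark (FE FF). The sketch below shows how a reader that skips the BOM check would garble Japanese text. Whether that is what happened in the viewer used here is an assumption; the thread does not say. The function name is hypothetical, and Latin-1 is used as a rough stand-in for PDFDocEncoding, which differs from it for some byte values.

```python
import codecs


def decode_pdf_text_string(raw: bytes) -> str:
    """Decode a PDF text string (hypothetical helper, simplified)."""
    # A UTF-16BE text string starts with the byte order mark FE FF.
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Assumption: approximate PDFDocEncoding with Latin-1.
    return raw.decode("latin-1")


# Hypothetical popup text as it might be stored in the PDF.
stored = codecs.BOM_UTF16_BE + "注釈".encode("utf-16-be")

correct = decode_pdf_text_string(stored)
# A reader that ignores the BOM treats every byte as one character.
naive = stored.decode("latin-1")

print(correct)
print(len(naive))
```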

SunnyTokyo · April 18, 2021, 7:52am

Oh, we didn’t notice that… Confirmed that Acrobat shows the correct characters. Thx again!!
