Is it possible to partially render a PDF file?

Hi, can I partially render a PDF file?
Suppose the PDF is 60-100 MB. It will take a while to load in the viewer, so is it possible to render only the first 4 pages and load the remaining pages as the user scrolls down?

Hi there,

PDF.js Express already does this out of the box where possible. However, this kind of loading only works if the PDF is linearized.

Thanks!
Logan

Hi Logan.

I am trying to get range requests working, and I believe I have set my server up to support them. However, the web viewer always downloads documents in full.

To test this I have done the following:

This is a 2 GB linearized document: https://s3.amazonaws.com/pdftron/downloads/pl/2gb-sample-file.pdf

Using PDF.js:
https://mozilla.github.io/pdf.js/web/viewer.html?file=https%3A%2F%2Fs3.amazonaws.com%2Fpdftron%2Fdownloads%2Fpl%2F2gb-sample-file.pdf

You can see the partial requests (status 206) flowing in, so this works as expected.
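
For anyone repeating this check outside the browser's network tab, here is a minimal TypeScript sketch of how the response to a ranged request (one sent with a header such as `Range: bytes=0-1023`) could be classified. The type and function names are hypothetical and are not part of PDF.js or PDF.js Express. The sketch only assumes standard HTTP semantics: a 206 with a `Content-Range` header means the server honoured the range, while a 200 means it sent the whole body.

```typescript
// Hypothetical helper types for this example only.
type ParsedRange = { start: number; end: number; total: number | null };

type RangeOutcome =
  | { kind: "partial"; start: number; end: number; total: number | null }
  | { kind: "full" }
  | { kind: "other"; status: number };

// Parses a Content-Range value such as "bytes 0-1023/2147483648".
// Returns null when the value does not describe a satisfied byte range.
function parseContentRange(value: string): ParsedRange | null {
  const match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(value.trim());
  if (!match) return null;
  const start = Number(match[1]);
  const end = Number(match[2]);
  if (end < start) return null;
  // "*" means the server did not report the total size.
  const total = match[3] === "*" ? null : Number(match[3]);
  return { start, end, total };
}

// Classifies a response by its status code and Content-Range header value.
function classifyRangeResponse(status: number, contentRange: string | null): RangeOutcome {
  if (status === 206 && contentRange !== null) {
    const parsed = parseContentRange(contentRange);
    if (parsed) {
      return { kind: "partial", start: parsed.start, end: parsed.end, total: parsed.total };
    }
  }
  if (status === 200) return { kind: "full" }; // range ignored, whole file sent
  return { kind: "other", status };
}
```

A `partial` outcome corresponds to the 206 responses described above; a `full` outcome corresponds to a viewer or server falling back to downloading the entire file.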

Using PDFTron:
https://www.pdftron.com/webviewer/demo/
Choose File > paste in link

Again, the range requests are working as expected.

Using PDF.js Express:
https://pdfjs.express/demo
Try Your Own File > paste in link

This time, no range requests are made and the viewer tries to download the whole file. It's the same on my own instance of PDF.js Express.

I am wondering if I need to do something special to turn this feature on - or if this is a bug?

Thanks, Luke

Hey there! You shouldn’t have to do anything to enable this functionality.

We can reproduce this issue and will investigate.

Thanks!
Logan

Hi.

I’m wondering whether you have any further information on this issue. I’m facing the same problem and have been trying to debug it in the dev console, with limited success.

It would be good to confirm my assumptions about what the viewer is expecting in order to use byte range support. It looks to me like the requirement is for the site to return the following headers:

  • Content-Encoding: “identity” or not supplied
  • Accept-Ranges: “bytes”
  • Content-Length: Must return a value that parseInt can parse into a number.

In addition, the Content-Length value must be greater than twice the chunk size, and the URL must use the http or https protocol.
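
The conditions listed above can be sketched as a single check. This is only an illustration of the assumptions in this post, not the viewer's actual code: the function name, the shape of the header map, and the 65536-byte default chunk size are all assumptions made for the example.

```typescript
// Hypothetical check mirroring the conditions assumed in this post.
// Header names are expected in lower case.
type HeaderMap = Record<string, string | undefined>;

function canUseRangeRequests(url: string, headers: HeaderMap, chunkSize: number = 65536): boolean {
  // Only http/https URLs qualify.
  if (!/^https?:/i.test(url)) return false;

  // Accept-Ranges must be exactly "bytes".
  if (headers["accept-ranges"] !== "bytes") return false;

  // Content-Encoding must be absent or "identity".
  const encoding = headers["content-encoding"];
  if (encoding !== undefined && encoding !== "identity") return false;

  // Content-Length must parse to a number...
  const rawLength = headers["content-length"];
  const length = parseInt(rawLength === undefined ? "" : rawLength, 10);
  if (isNaN(length)) return false;

  // ...and be more than twice the chunk size.
  return length > 2 * chunkSize;
}
```

Feeding the response headers copied from the dev console into a check like this can help narrow down which condition, if any, is causing the fallback to a full download.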

There also seems to be support for disabling the use of ranges and setting the range size, but I haven’t spent the time to figure out whether/how they can be configured.

Hi Logan,

Just checking in to see if you have any updates about this issue.

Cheers, Luke

Hi there,

There hasn’t really been any progress on this yet, as some higher-priority things have come up. We should have some resources freed up soon, and then we will start investigating.

I will keep you updated!

Hey everyone,

Just a heads up that we found the issue and it will be fixed in the next patch, hopefully some time this week.

Thank you for your patience on this one.

Logan

This has been fixed!

Hi Logan.

I’m sorry to report this but it seems that the work is incomplete. I’m seeing a slew of “Bad end offset” messages and what appear to be multiple retrievals of the entire document:

I’m also attaching a dev console screenshot at the time the error is thrown that will hopefully be very helpful.

Hey there!

I did a bit of research on this one, and it seems like it may be due to an issue with Chrome. Do you get this error in Firefox? Also, can you try clearing your cache to see if that fixes it?

Also if you could give me the URL of the document you’re trying to load, that would be great!

Thanks,
Logan

Thanks for the quick reply, Logan.

The URL of the PDF is on a staging server for the site we’re developing. Can I PM you the details?

Yup, absolutely!

Logan

Hi Logan,

I can confirm that range requests are now working for me.

However, the initial request (with response status 200) seems to continue downloading alongside the range requests. Is this the intended behaviour? My understanding is that the first request should be cancelled once range requests are initiated.

Here is a screenshot demonstrating the current behaviour.

Whereas this is a screenshot of how PDFTron handles it, which is more like what I would expect.

I also noticed that PDFTron’s initial request receives a 206 response instead of 200, likely due to the presence of the Range header on that initial request.

[Screenshot: PDFTron initial request]

[Screenshot: PDF.js Express initial request]

Cheers, Luke.

Hey there,

This is the expected behaviour of PDF.js. It issues range requests on demand and, in parallel, downloads the entire document. I am not sure why they implemented it this way, but it is intended.
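
For reference, Mozilla's PDF.js documents `getDocument` parameters that control this behaviour: `disableRange`, `disableStream`, `disableAutoFetch`, and `rangeChunkSize`. The sketch below only builds such a parameter object. The `LoadingParams` type and the `onDemandOnly` function are local stand-ins written for this example, and nothing in this thread confirms that PDF.js Express forwards these settings.

```typescript
// Local stand-in for the relevant subset of Mozilla PDF.js
// `getDocument` parameters (not the library's own type).
interface LoadingParams {
  url: string;
  disableRange?: boolean;
  disableStream?: boolean;
  disableAutoFetch?: boolean;
  rangeChunkSize?: number;
}

// Hypothetical helper: keeps range requests enabled but asks PDF.js
// not to fetch the rest of the file in the background.
function onDemandOnly(url: string, rangeChunkSize: number = 65536): LoadingParams {
  return {
    url: url,
    disableRange: false,     // keep byte-range requests enabled
    disableStream: true,     // PDF.js docs: needed for disableAutoFetch to work
    disableAutoFetch: true,  // no background pre-fetching of unneeded data
    rangeChunkSize: rangeChunkSize,
  };
}
```

With Mozilla's `pdfjs-dist`, an object like this would be passed to `getDocument(...)`.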

That being said, I will be adding an option in the future to disable the extra downloading. Keep an eye out for it in a future release.

Thanks!

Thanks for clarifying that, Logan.

Is this issue solved? Range requests do not seem to work on the official demo or in my project (8.7.0).
PDF: https://s3.amazonaws.com/pdftron/downloads/pl/2gb-sample-file.pdf
demo: PDF.js Viewer Demo | PDF.js Express