I am fetching a PDF file from an S3 bucket with the PDF.js Express Web Viewer and saving annotations, currently without an S3 access key and secret key. I now need to read and save the PDF object using the S3 bucket's access key and secret key.
How can I pass the S3 access key and secret key when reading and saving the PDF object?
PDF.js Express cannot read or write files on S3 that require secret access keys. It can only read remote files that support CORS and are publicly accessible.
To do this, you would have to download the document in your own code outside of PDF.js Express, and then pass the document into the loadDocument API.
The loadDocument API also accepts a customHeaders option; those headers are forwarded when we fetch the document from the URL you provide. If your authentication happens via request headers, this option might work for you.
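As a rough sketch of the header option described above: this assumes a server in front of the file that accepts a bearer token in the Authorization header, which S3 itself does not accept in this form. ViewerLike, buildAuthHeaders and loadWithHeaders are hypothetical names used for illustration; only loadDocument and customHeaders come from the viewer.

```typescript
// Hypothetical minimal shape of the viewer instance; only loadDocument is used here.
interface ViewerLike {
  loadDocument(
    src: string,
    options?: { customHeaders?: Record<string, string> }
  ): void;
}

// Build the headers that are forwarded with the document request.
// Assumption: the server hosting the file understands a bearer token.
function buildAuthHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}

// Ask the viewer to fetch the URL and send the headers along with the request.
function loadWithHeaders(instance: ViewerLike, url: string, token: string): void {
  instance.loadDocument(url, { customHeaders: buildAuthHeaders(token) });
}
```

Inside the WebViewer(...).then(instance => ...) callback this would be called as loadWithHeaders(instance, url, token).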
Keep in mind this will still only work if the file is publicly accessible. If it is not, you have to fetch the file first (probably using the S3 SDK) and then pass the resulting blob into loadDocument.
I am going to try customHeaders first, passing my S3 access_key and secret_key. If that does not work, I will try the loadDocument API.
I will let you know if I have any other questions.
I tested this by enabling public read access on my PDF file in the S3 bucket and applying the required CORS policy to the bucket, passing the credentials as below:
customHeaders: {
  apiKey: [apiKey],
  secretKey: [secretKey]
}
Yes, I can read the PDF file. However, if I remove the S3 credentials from the custom headers I can still read it, because the file has public read access.
So what is the point of passing S3 credentials if the file has to be given public read access?
As I mentioned before, PDF.js Express can only read public files. Your S3 apiKey and secretKey are not valid headers for S3, which is why it works with or without them.
The apiKey and secretKey you have are used to download files programmatically via the S3 SDK. If you wanted to keep your files private but still use them with Express, you would have to use the SDK to download the blob and then pass it to WebViewer.
This might look something like this:
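import AWS from 'aws-sdk';
import WebViewer from '@pdftron/pdfjs-express';

// Configure the SDK with the region and credentials for your bucket.
AWS.config.update({
  region: 'your region',
  credentials: {YOUR_CREDENTIALS}
});

WebViewer(...).then(async instance => {
  const s3 = new AWS.S3();
  // Download the private object with the SDK instead of a public URL.
  const body = await new Promise<BlobPart>((resolve, reject) => {
    s3.getObject({
      Bucket: 'YOUR_BUCKET',
      Key: 'KEY_OF_FILE',
    }, (err, resp) => {
      if (err) {
        // Reject so a failed download never reaches loadDocument.
        console.log(err);
        reject(err);
        return;
      }
      resolve(resp.Body as BlobPart);
    });
  });
  // Wrap the downloaded bytes in a Blob and hand it to the viewer.
  const file = new Blob([body], { type: 'application/pdf' });
  instance.loadDocument(file);
});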
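The download step can also be pulled out into a small helper, which makes the error path easier to see and to test. This is only a sketch: ObjectStore and getObjectBody are hypothetical names, and the interface is a simplified stand-in for a callback-style getObject, not the SDK's exact typings.

```typescript
// Hypothetical minimal shape of an S3-like client with a callback-style getObject.
interface ObjectStore<T> {
  getObject(
    params: { Bucket: string; Key: string },
    callback: (err: Error | null, resp?: { Body: T }) => void
  ): void;
}

// Wrap the callback API in a Promise that rejects on failure,
// so a failed download never produces an undefined body.
function getObjectBody<T>(
  store: ObjectStore<T>,
  bucket: string,
  key: string
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    store.getObject({ Bucket: bucket, Key: key }, (err, resp) => {
      if (err) {
        reject(err);
        return;
      }
      if (!resp || resp.Body === undefined) {
        reject(new Error(`Empty response for ${key}`));
        return;
      }
      resolve(resp.Body);
    });
  });
}
```

The resolved body would then be wrapped in a Blob and passed to loadDocument, as in the snippet above.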
Yes, Amazon provides a JS SDK to fetch files from your S3 bucket. However, this has nothing to do with the PDF.js Express REST API so I am unsure of your question.
The PDF.js Express REST API is for merging/extracting annotations from a PDF, and a few other useful tools.
Keep in mind this will still only work if the file is publicly accessible. If it is not, you have to fetch the file first (probably using the S3 SDK) and then pass the resulting blob into loadDocument.
That is why I asked whether we have to use the PDF.js Express REST API to load the PDF with "loadDocument" after fetching the S3 object with the SDK.
Please confirm.