1 Reader

You can load a PDF document into a JavaScript PDF object from any object that can be converted to an InputStream (File, Blob, URL, base64 string).

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/sample.pdf')); 
    console.log(pdf.getNumberOfPages()); 
</script>
2

You can use the PdfDocument methods listed below to interact with the PDF.

| Return | Method | Description |
| --- | --- | --- |
| Blob | toBlob() | Returns the document as a java.sql.Blob |
| Reader | join(Reader other) | Joins the current PDF with the provided one and returns a new Reader |
| ArrayList<Reader> | split(int size) | Splits the PDF into chunks of size pages |
| Reader | slice(int start, int end) | Selects the pages from start to end |
| Reader | rotate(int page, double angle) | Rotates the given page by angle and returns the new document |
| int | getNumberOfPages() | Returns the number of pages |
| int | getNumberOfImages() | Returns the number of images |
| String | getTextFromPage(int page) | Returns the text layer of the given page |
| ArrayList<String> | getTextFromDocument() | Returns an array with the text of each page in the document |
| ArrayList<PdfImage> | getImages() | Returns an array of the document images |
| Reader | insertPage(int position) | Inserts a new blank page at position. Position 1 places it before every page in the document. |

2 Splitting documents

The split method breaks the PDF document pages into documents of given size.

For example, to split a 12-month calendar into two pieces of 6 months each:

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/calendar-2020.pdf'));
    // split in 6 page docs
    var docs = pdf.split(6);
    // Return as a resultset
    return new Ax.rs.Reader().build(docs);
</script>
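
The page grouping behind split can be sketched in plain JavaScript. This is only an illustration: it does not call Ax.pdf.Reader, the helper name chunkPages is hypothetical, and treating a trailing shorter chunk as valid (for example 12 pages split by 5) is an assumption rather than documented behaviour.

```javascript
// Hypothetical helper: models a document as 1-based page numbers and
// groups them into consecutive chunks of `size` pages.
function chunkPages(pageCount, size) {
    if (!Number.isInteger(size) || size < 1) {
        throw new RangeError("size must be a positive integer");
    }
    var chunks = [];
    for (var first = 1; first <= pageCount; first += size) {
        // Assumption: the last chunk may hold fewer than `size` pages
        var last = Math.min(first + size - 1, pageCount);
        var chunk = [];
        for (var page = first; page <= last; page++) {
            chunk.push(page);
        }
        chunks.push(chunk);
    }
    return chunks;
}

// A 12-page calendar split by 6 gives two chunks: pages 1-6 and pages 7-12
console.log(chunkPages(12, 6));
```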

3 Slicing documents

The slice method selects the pages starting at the given start argument, and ends at the given end argument.

For example, to extract the months October to December from the calendar:

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/calendar-2020.pdf'));
    // select pages 10 to 12 (October to December)
    return pdf.slice(10, 12);
</script>
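
Note that the page range differs from JavaScript's own Array.prototype.slice: in the calendar example both start and end are 1-based page numbers and the end page is included. The following plain JavaScript sketch, with a hypothetical slicePages helper that does not touch Ax.pdf.Reader, shows the index translation.

```javascript
// Hypothetical helper: 1-based, inclusive start and end, as in the
// calendar example. Array.prototype.slice is 0-based with an exclusive
// end, so only the start index needs shifting.
function slicePages(pages, start, end) {
    return pages.slice(start - 1, end);
}

var months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Selects Oct, Nov and Dec
console.log(slicePages(months, 10, 12));
```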

4 Joining documents

To join two PDF documents, simply call the join method on the source document.

<script>
    var pdf2019 = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/calendar-2019.pdf'));
    var pdf2020 = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/calendar-2020.pdf'));
    return pdf2019.join(pdf2020);
</script>

5 Removing pages

To remove pages from a document, simply pass the page numbers to the remove method.

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/calendar-2020.pdf'));
    // remove jan, march, july from calendar
    return pdf.remove(1, 3, 7);
</script>
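
As a rough illustration of which pages survive, the sketch below models the calendar as an array of month names in plain JavaScript. The removePages helper is hypothetical and independent of Ax.pdf.Reader; it only mirrors the 1-based page numbering used above.

```javascript
// Hypothetical helper: drops the given 1-based page numbers and keeps
// the remaining pages in their original order.
function removePages(pages) {
    var drop = new Set(Array.prototype.slice.call(arguments, 1));
    return pages.filter(function (_, index) {
        return !drop.has(index + 1);
    });
}

var months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Removing pages 1, 3 and 7 leaves the nine other months
console.log(removePages(months, 1, 3, 7));
```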

6 Text extraction

On a PDF containing text, you can extract the text of a page by using the getTextFromPage method.

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/sample.pdf')); 
    console.log(pdf.getTextFromPage(2))
</script>
Simple PDF File 2  

             ...continued from page 1. Yet more text. And more text. And more text.  
             And more text. And more text. And more text. And more text. And more  
             text. Oh, how boring typing this stuff. But not as boring as watching  
             paint dry. And more text. And more text. And more text. And more text.  

             Boring.  More, a little more text. The end, and just as well.
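
The extracted text keeps layout whitespace such as leading indentation, trailing spaces and blank lines. If only the words matter, a small post-processing step can tidy it up. The sketch below is plain JavaScript with a hypothetical normalizeText helper; it works on any string and makes no assumption about the Reader API.

```javascript
// Hypothetical helper: collapses runs of whitespace inside each line,
// trims the line and drops lines that end up empty.
function normalizeText(text) {
    return text
        .split(/\r?\n/)
        .map(function (line) { return line.replace(/\s+/g, " ").trim(); })
        .filter(function (line) { return line.length > 0; })
        .join("\n");
}

var raw = "Simple PDF File 2  \n\n" +
          "             Boring.  More, a little more text. The end, and just as well.\n";

console.log(normalizeText(raw));
```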

7 Byte Code extraction

On a PDF containing text, you can extract each text chunk of a page together with its position and size by using the getBitmapFromPage method. The method returns a string containing a JSON array of arrays, one inner array per text chunk.

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/sample.pdf'));
    let str = pdf.getBitmapFromPage(2);
    console.log("========JSON DATA FROM TEXT IN PAGE ==========");
    console.log(str);
    console.log("==============================================");
    
    // Parse the JSON text instead of evaluating it as code
    let arrPdfByte = JSON.parse(str);
    arrPdfByte.forEach(chunk => {
        console.log("******************************************");
        console.log("COL POSITION ON PDF DOCUMENT: " + chunk[0]);
        console.log("ROW POSITION ON PDF DOCUMENT: " + chunk[1]);
        console.log("CHAR WIDTH                  : " + chunk[2]);
        console.log("COL POSITION ON TEXT LAYOUT : " + chunk[3]);
        console.log("ROW POSITION ON TEXT LAYOUT : " + chunk[4]);
        console.log("TEXT                        : " + chunk[5]);
        console.log("WORD WIDTH                  : " + chunk[6]);
        console.log("WORD HEIGHT                 : " + chunk[7]);
    })
</script>
========JSON DATA FROM TEXT IN PAGE ==========
[
[57,722,12.239525,10,0," Simple PDF File 2 ",233,19],
[69,689,4.4004154,12,3," ...continued from page 1. Yet more text. And more text. And more text. ",317,7],
[69,677,4.5735703,12,4," And more text. And more text. And more text. And more text. And more ",320,7],
[69,665,4.2432384,12,5," text. Oh, how boring typing this stuff. But not as boring as watching ",301,7],
[69,653,4.4156933,12,6," paint dry. And more text. And more text. And more text. And more text. ",318,7],
[69,641,4.137618,12,8," Boring.  More, a little more text. The end, and just as well. ",261,7]]

==============================================
******************************************
COL POSITION ON PDF DOCUMENT: 57
ROW POSITION ON PDF DOCUMENT: 722
CHAR WIDTH                  : 12.239525
COL POSITION ON TEXT LAYOUT : 10
ROW POSITION ON TEXT LAYOUT : 0
TEXT                        :  Simple PDF File 2 
WORD WIDTH                  : 233
WORD HEIGHT                 : 19
******************************************
COL POSITION ON PDF DOCUMENT: 69
ROW POSITION ON PDF DOCUMENT: 689
CHAR WIDTH                  : 4.4004154
COL POSITION ON TEXT LAYOUT : 12
ROW POSITION ON TEXT LAYOUT : 3
TEXT                        :  ...continued from page 1. Yet more text. And more text. And more text. 
WORD WIDTH                  : 317
WORD HEIGHT                 : 7

....
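
Positional indexes such as chunk[5] are easy to mix up, so it can help to convert each chunk into an object with named fields. The sketch below is plain JavaScript and runs on a copy of the first two rows of the output above. The field names and the toRecords helper are hypothetical; only the column order is taken from the labels printed in the example.

```javascript
// Hypothetical field names, in the column order printed by the example
var FIELDS = ["pdfCol", "pdfRow", "charWidth", "layoutCol",
              "layoutRow", "text", "wordWidth", "wordHeight"];

// Hypothetical helper: turns the JSON string into an array of objects
function toRecords(json) {
    return JSON.parse(json).map(function (chunk) {
        var record = {};
        FIELDS.forEach(function (name, i) { record[name] = chunk[i]; });
        return record;
    });
}

// First two rows copied from the sample output
var sample = '[[57,722,12.239525,10,0," Simple PDF File 2 ",233,19],' +
             '[69,689,4.4004154,12,3," ...continued from page 1. Yet more text. And more text. And more text. ",317,7]]';

var records = toRecords(sample);
console.log(records[0].text.trim()); // Simple PDF File 2
console.log(records[1].layoutRow);   // 3
```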

8 Image preview generation

You can obtain an image preview for a PDF page using the getPreviewFromPage method.

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/sample.pdf'));
    for (var page = 1; page <= pdf.getNumberOfPages(); page++) {
    	var data = pdf.getPreviewFromPage(page);
    	var name = "/tmp/image" + page + ".jpg";
    	console.log("===== PAGE PREVIEW " + page + " saved to " + name);
    	console.log(data);
        new Ax.io.File(name).write(data);
    }
</script>
===== PAGE PREVIEW 1 saved to /tmp/image1.jpg
00000000 FF D8 FF E0 00 10 4A 46 49 46 00 01 02 00 00 01 ......JFIF......
00000010 00 01 00 00 FF DB 00 43 00 08 06 06 07 06 05 08 .......C........
00000020 07 07 07 09 09 08 0A 0C 14 0D 0C 0B 0B 0C 19 12 ................
00000030 13 0F 14 1D 1A 1F 1E 1D 1A 1C 1C 20 24 2E 27 20 ........... $.' 
00000040 22 2C 23 1C 1C 28 37 29 2C 30 31 34 34 34 1F 27 ",#..(7),01444.'
00000050 39 3D 38 32 3C 2E 33 34 32 FF DB 00 43 01 09 09 9=82<.342...C...
00000060 09 0C 0B 0C 18 0D 0D 18 32 21 1C 21 32 32 32 32 ........2!.!2222
00000070 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 2222222222222222
00000080 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 2222222222222222
00000090 32 32 32 32 32 32 32 32 32 32 32 32 32 32 FF C0 22222222222222..
259582 byte(s) more
===== PAGE PREVIEW 2 saved to /tmp/image2.jpg
00000000 FF D8 FF E0 00 10 4A 46 49 46 00 01 02 00 00 01 ......JFIF......
00000010 00 01 00 00 FF DB 00 43 00 08 06 06 07 06 05 08 .......C........
00000020 07 07 07 09 09 08 0A 0C 14 0D 0C 0B 0B 0C 19 12 ................
00000030 13 0F 14 1D 1A 1F 1E 1D 1A 1C 1C 20 24 2E 27 20 ........... $.' 
00000040 22 2C 23 1C 1C 28 37 29 2C 30 31 34 34 34 1F 27 ",#..(7),01444.'
00000050 39 3D 38 32 3C 2E 33 34 32 FF DB 00 43 01 09 09 9=82<.342...C...
00000060 09 0C 0B 0C 18 0D 0D 18 32 21 1C 21 32 32 32 32 ........2!.!2222
00000070 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 2222222222222222
00000080 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32 2222222222222222
00000090 32 32 32 32 32 32 32 32 32 32 32 32 32 32 FF C0 22222222222222..
214524 byte(s) more
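
Both dumps start with FF D8 FF, the JPEG start-of-image marker followed by the first segment marker, and carry the ASCII tag JFIF at offset 6. A quick sanity check on such a header can be sketched in plain JavaScript as follows. The helper names are hypothetical, and the `& 0xFF` masking is a precaution under the assumption that the data may arrive as signed bytes.

```javascript
// Hypothetical helper: true when the data starts with FF D8 FF
function looksLikeJpeg(bytes) {
    return bytes.length >= 3 &&
        (bytes[0] & 0xFF) === 0xFF &&
        (bytes[1] & 0xFF) === 0xD8 &&
        (bytes[2] & 0xFF) === 0xFF;
}

// Hypothetical helper: true when the ASCII tag "JFIF" sits at offset 6
function hasJfifTag(bytes) {
    if (bytes.length < 10) {
        return false;
    }
    var tag = "";
    for (var i = 6; i < 10; i++) {
        tag += String.fromCharCode(bytes[i] & 0xFF);
    }
    return tag === "JFIF";
}

// First 16 bytes copied from the dump above
var header = [0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 0x4A, 0x46,
              0x49, 0x46, 0x00, 0x01, 0x02, 0x00, 0x00, 0x01];

console.log(looksLikeJpeg(header), hasJfifTag(header)); // true true
```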

9 Image extraction

Images contained in a PDF can be extracted. Each image is returned as a PdfImage object.

You can determine whether a document contains only an image layer (a scanned document with no text layer) by using the isImageOnly method.

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/scanned.pdf')); 
    console.log(pdf.isImageOnly())
</script>
true

9.1 Getting images

To extract all images from a PDF document:

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/scanned.pdf')); 
    for (var image of pdf.getImages()) {
        console.log("Name    : " + image.getName());
        console.log("Type    : " + image.getType());
        console.log("FileName: " + image.getFileName());
        console.log("Length  : " + image.getLength());
        // Simply send the image to /tmp using its file name
        image.writeTo("/tmp");
    }
</script>
Name    : image0001
Type    : png
FileName: image0001.png
Length  : 30022

9.2 PdfImage

The PdfImage object provides methods to get information about the image, write it to disk, or convert it to a blob.

| Return | Method | Description |
| --- | --- | --- |
| byte[] | getBytes() | Returns the image bytes |
| String | getType() | Returns the image type (png, jpg, gif) |
| String | getName() | Returns the name of the image as given by the image extractor |
| String | getFileName() | Returns the name plus type |
| int | getLength() | Returns the image length in bytes |
| void | writeTo(String file) | Writes the image to the given file name or directory (if file points to a directory) |
| void | writeTo(File file) | Writes the image to the given file name or directory (if file points to a directory) |
| Blob | toBlob() | Converts the image to a java.sql.Blob ready to be used in an SQL operation |

10 Insert page

The insertPage method inserts a new blank page into the document at the position passed as parameter. Position 1 places the new page before every page in the document.

<script>
    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/sample.pdf'));
    pdf = pdf.insertPage(1); // Now, document has 1 more page
    pdf = pdf.insertPage(3); // New page after original 1st page
    
    return pdf.toBlob();
</script>
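
Because each call shifts the pages that follow, the second position refers to the document that already contains the first blank page. The plain JavaScript sketch below traces the two calls above on a two-page document. The insertBlank helper is hypothetical and does not use Ax.pdf.Reader, and allowing a position one past the last page (appending) is an assumption.

```javascript
// Hypothetical helper: returns a copy of the page list with a blank
// page placed at the given 1-based position.
function insertBlank(pages, position) {
    // Assumption: position may be at most one past the last page
    if (position < 1 || position > pages.length + 1) {
        throw new RangeError("position out of range: " + position);
    }
    var copy = pages.slice();
    copy.splice(position - 1, 0, "blank");
    return copy;
}

var doc = ["page 1", "page 2"];
doc = insertBlank(doc, 1); // blank, page 1, page 2
doc = insertBlank(doc, 3); // blank, page 1, blank, page 2
console.log(doc.join(", "));
```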

11 Watermarks

You can add text and image watermarks to a PDF document with the addWatermark method.

<script>

    var pdf = new Ax.pdf.Reader(new Ax.net.URL('https://bitbucket.org/deister/axional-docs-resources/raw/master/PDF/sample.pdf'));
    
    var out = pdf.addWatermark(options => {
    	options.setOpacity(0.40);
    	// Apply only on specific pages ... (1,3,5,7)
    	options.setPages(1);
    	options.addText("This is center watermark").setFontSize(14).setFontColor(0, 255, 0);
    	options.addText("This is top watermark",    -1, options.getTop()    - 20, 0).setFontFamily("COURIER").setFontSize(14).setFontColor(0, 0, 255);
    	options.addText("This is bottom watermark", -1, options.getBottom() + 20, 0);
    	options.addText("This is left watermark",   options.getLeft()  + 20,  -1, 90);
    	options.addText("This is right watermark",  options.getRight() - 20,  -1, 90).setFontFamily("HELVETICA").setFontSize(14).setFontColor(255, 0, 0);
    	
    	var image = options.setImage(Ax.util.Base64.decode('iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAHVUlEQVR42rWXW1DU1x3Hv/+9sOwFlsuyXMpFyWiVq6CiMqNt1bFGpZJd1Eg1wbzlpTN5y7TP7eQtM33JWySJtZk0XCzWsY6SBjJYQUHkotUJIiDsLosILLsse/n3e/67i2jQrJn2zPxnz57/Oef3Ob/b+f0lPGuappaWVkjSYTkchoyf2sRKCSpJQigUunC8rq6OA8GXzZZW9ZMIMG+rrf3Jol9sLpcLV9vbW3578uQx/g39GEDG183NLvtbb+Hh6CMqQopPyovnl2WsX1eAc+fP41R9PRwOB77p6Pi2/sSJ/WtpYrUU69+am511BHg0Nq4AvC6EEC6egvw8/OXLL3H44JtISTFjamoKnV1dUyfq6go4LfCjAGPjE1CpVC+XJMBehKNg8YTpP/l5uThPALvNBs+CB+npaYomCDF53G5ftxpiTYDxicc/BIgKVTSj1UJKSHhe/vIywoEA5FAIuTnZCkD922/Dz3Gf16doQvjEd11d0zTzz2IQawJMPJ4igPSccEmjgUqnUwSHvIvoTktD4SefKK9H3n8fVU+eQKU3IrzsR3ZyEnpu3cL9Bw8UASIuhGZO0yeaW1tBgFQOPV0DoIkANjyecihhtCJcrYbKYMDgzh3QFhQgOD8Pw8gIJicmlCk5ubnwFhZCnZyM5dFRVA8MwPAS633d0oJjNlsmu64fAjQRgHZzOJwRB4ydXG+AWq/DzeJiWKhmn9eL6g8+eOYHtH3Xxx8jkZAzNM/WoSFqyYcw58nBEF+HlWnZ2VlgpOGY3f5qAKfTFQGgH6i5af+unQhxo7nhIdg+/BAh/zLmxsbgvNWrLMzcWglzfj7UugQ0f/QRzEXFUGtUKLt+IwIRCik+mpVljQ/A5ZqOOBtVrzabMWyrRWL/HZQzSWmsmej8w+9XbBvbRPR3//FPCLqc6Kedl8rLUNTcitDcXBRAhtWaER/AtNsNiacXTicWDhw9Cs2dO9h+pAZ9jWdBo6C4MQv6qkRloa97CUMNDmaZMCoazqDnYhuCZWUovXBBOUjY74dI7xkWS3wA7pmZyOlNJnxLm1rp1TtO1mN+dAwj/7yCosZMJJbqgKWoDhIlLA34MdzgROGvDyB5XT5u/PU8XPML+AV9JuTxKFqwpKfHBzDDkIoBOD//HIPvvYejDQ2Y+FcH5kYnUTacDcxSeCAKoOU2qRLuFE3BvC4Hub/cgwuNjSj59FNkvvPOCkA6QzcugNmns3TACIDjs88UgJrjx+G82QvPiAObhiyMYvlZPtPySZFwr9gNU2EWMrdVou2rrxSArHffVQAQDiE1JTU+gKdz85EIIEA7E5KFyafqwAEs0jlnev+D/LMmaDdrAH9UAzoJgbtBjJ1h2q38OYx0tu4rV+BmFtwblqMAYaSYk+MDmKftRIwrTsgZfYePQOrswPaqKrhvD7Jy0CDrzzpoKiLpOtgXhuN3fnaCsGwpQU93N+Tde1Dxj4uQyCicUMRhMn0pLoAFXiDKS7UGGkMibtvtkC5dQlFpKXQGI2YH7kIOSEwy0TCkMiStjNTSzfAzTQ8zE8qHDmFLUxOC3iXaPzIxKckUH4DHsxgZURKRHje2boXM/tPbt7Fv82Zo6aChAINu0ReZZtRDrdUgQEe7dvcuUrZsgUSV7+B9IDKiUL9oJpMxPoDFRe/KPaBiGGrE5lTvJfbzkpKwzH7F+vXPJaK+hw+RQNOMLyzgEENPy35QQLKvpEE2o9EQH4DX51vZWaUhADf79759cLe3w8pY9jBPGBITMb20pEzLYN/LvonvXHxn2bsXO69do0sQIBhYSZkGvT4+AF90Y8UEjIAn16/jm+pq2OgLc8yIi7zXB5liN5aXK9Pu9/ejhCnbaLXCzAzYTNv/qqsLabt2IcRIiJlAT9C4AJZF6hTyhfoJ0U7nM/HqLd+2De7eXiW9plLQ8P37ysKijRsxSzCRti2Vlei/eRMeXtF76YxBChdmEIISGFVxAQREZUMhaqr+SU8PrjL8anbvhn9yEgHGtJpg5k2b0Hr1qrKwdv9+zN27R8ek7Zk7dDk5aOvsxH6GY9r27bxJgwxHGVquiwtA2E7UAWoOttEEFvYrSkrgJYCS+jMy8GB8HEFWT6JpWGRsyMvD0vR0xNYE6BschJv71NAEoh6X2Re+FBdAiGoTNeH4uXPoOn0aB4uKoKVZxAnFLZlAe1+m3fdQ7aJ10BwH6Q/L4urlWqGhANV9eXgY1V98gbxTp5SSTM21cQHI0Rd/pxms/K3csEF85ayMT/Ck31PVB6MaucwTv0HT5FIzsTlq5ope1oRCym9EuR4dfzVAtCiNte6aGjguXoS49VfHvMgSFWfPooA3pGiPePP1nTmj1IGr54lYyjpyBFVtbSt7vrImbGptddpYfPw/WzOLFHtt7doALJm/ZyiZwtHMJeyNtb6Ooh8hcmxe7EPlJXPlaB5QRb4rPPz+fGMtgBQ+6/mkvjD+v2yCmMUGHmKN7wJRVhj5JLz+vq/VmBohbjulnPkvzMTLTpQMST8AAAAASUVORK5CYII='));
    	image.scaleToFit(512, 512);
    });
    
    new Ax.io.File("/tmp/watermark.pdf").write(out);

</script>

The sample PDF with watermarks applied only to the first page.