Obtains the text corresponding to the requested page of a document in HOCR format.

1 hocr.getTextFromPage

    <hocr text /> +

Obtains the text from a HOCR document.
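
For readers unfamiliar with the format, the kind of extraction described here can be approximated outside xsql-script. The following is a minimal Python sketch, not the implementation of hocr.getTextFromPage. It assumes that the document is well-formed XML, that pages are elements with class `ocr_page` numbered from 1 in document order, and that each element with class `ocr_line` yields one line of text. The helper name `get_text_from_page` is hypothetical.

```python
import xml.etree.ElementTree as ET


def _has_class(elem, name):
    """True if the element's class attribute contains the given class name."""
    return name in elem.get("class", "").split()


def get_text_from_page(hocr, page):
    """Return the text of the 1-based `page`, one OCR line per text line.

    Hypothetical helper: the 1-based numbering and the newline-joined
    output are assumptions, not documented behaviour of the xsql function.
    """
    root = ET.fromstring(hocr)
    # Collect page containers in document order.
    pages = [e for e in root.iter() if _has_class(e, "ocr_page")]
    if not 1 <= page <= len(pages):
        raise ValueError(f"page {page} not found; document has {len(pages)} page(s)")
    # Gather the text of every ocr_line inside the page, normalising whitespace.
    lines = [
        " ".join("".join(e.itertext()).split())
        for e in pages[page - 1].iter()
        if _has_class(e, "ocr_line")
    ]
    return "\n".join(lines)


SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <div class="ocr_page" title="bbox 0 0 2548 3300; image /path/to/scanned/image.png">
    <span class="ocr_line" title="bbox 659 143 863 177">Some Text</span>
    <span class="ocr_line" title="bbox 723 275 916 324">More Text</span>
  </div>
</html>"""

print(get_text_from_page(SAMPLE, 1))
```

With the sample above, the sketch prints `Some Text` and `More Text` on separate lines. The exact output format of the xsql-script function may differ.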

<xsql-script name='hocr.getTextFromPage'>
            <set name='m_ocr_text'><![CDATA[
            <html xmlns="http://www.w3.org/1999/xhtml">
                <div class="ocr_page" title="bbox 0 0 2548 3300; image /path/to/scanned/image.png">
                  <span class="ocr_line" title="bbox 659 143 863 177">Some Text</span>
                  <span class="ocr_line" title="bbox 723 275 916 324">More Text</span>
                </div>
            </html>]]></set>

                <hocr.getTextFromPage page='1'>
                    <m_ocr_text />