Leptonica
1.54
#include "allheaders.h"
Macros | |
#define | DEBUG_LINES 0 |
Functions | |
l_int32 | pixGetRegionsBinary (PIX *pixs, PIX **ppixhm, PIX **ppixtm, PIX **ppixtb, l_int32 debug) |
PIX * | pixGenHalftoneMask (PIX *pixs, PIX **ppixtext, l_int32 *phtfound, l_int32 debug) |
PIX * | pixGenTextlineMask (PIX *pixs, PIX **ppixvws, l_int32 *ptlfound, l_int32 debug) |
PIX * | pixGenTextblockMask (PIX *pixs, PIX *pixvws, l_int32 debug) |
BOX * | pixFindPageForeground (PIX *pixs, l_int32 threshold, l_int32 mindist, l_int32 erasedist, l_int32 pagenum, l_int32 showmorph, l_int32 display, const char *pdfdir) |
l_int32 | pixSplitIntoCharacters (PIX *pixs, l_int32 minw, l_int32 minh, BOXA **pboxa, PIXA **ppixa, PIX **ppixdebug) |
BOXA * | pixSplitComponentWithProfile (PIX *pixs, l_int32 delta, l_int32 mindel, PIX **ppixdebug) |
PIXA * | pixExtractTextlines (PIX *pixs, l_int32 maxw, l_int32 maxh, l_int32 minw, l_int32 minh) |
l_int32 | pixDecideIfText (PIX *pixs, BOX *box, l_int32 *pistext, PIXA *pixadb) |
l_int32 | pixFindThreshFgExtent (PIX *pixs, l_int32 thresh, l_int32 *ptop, l_int32 *pbot) |
Variables | |
static const l_int32 | MinWidth = 100 |
static const l_int32 | MinHeight = 100 |
#define DEBUG_LINES 0 |
Input: pixs (any depth) box (<optional> if null, use entire pixs) &istext (<return> 1 if text; 0 if photo; -1 if not determined) pixadb (<optional> pre-allocated, for showing intermediate computation; use null to skip) Return: 0 if OK, 1 on error
Notes: (1) It is assumed that pixs has the correct resolution set. If the resolution is 0, we set it to 300 and issue a warning. (2) If necessary, the image is scaled to 300 ppi; most of the processing is done at this resolution. (3) Text is assumed to be in horizontal lines. (4) Because thin vertical lines are removed before filtering for text lines, this should identify tables as text. (5) If box is null and pixs contains both text lines and line art, this function might return istext == true. (6) If the input pixs is empty, or for some other reason the result cannot be determined, return -1. (7) For debug output, input a pre-allocated pixa.
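Notes (1) and (2) describe a small resolution normalization step. The sketch below illustrates it in plain C; the helper name scale_factor_to_300ppi is hypothetical and is not part of Leptonica, and the code only mirrors the behavior stated in the notes (a resolution of 0 is treated as 300, with a warning).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not a Leptonica function: return the factor by
 * which an image with resolution 'res' (in ppi) would be scaled to reach
 * 300 ppi.  A resolution of 0 means "not set" and is treated as 300. */
double scale_factor_to_300ppi(int res)
{
    assert(res >= 0);  /* negative resolutions are not meaningful */
    if (res == 0) {
        fprintf(stderr, "warning: resolution not set; assuming 300 ppi\n");
        res = 300;
    }
    return 300.0 / (double)res;
}
```

With this helper, a 150 ppi scan would be scaled up by 2.0 and a 600 ppi scan scaled down by 0.5, while an image already at 300 ppi is left unchanged.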
Input: pixs (any depth, assumed to have nearly horizontal text) maxw, maxh (initial filtering: remove any components in pixs that are wider than maxw or taller than maxh) minw, minh (final filtering: remove extracted 'lines' with sizes smaller than minw or minh) Return: pixa (of textline images, including bounding boxes), or null on error
Notes: (1) This first removes components from pixs that are either wide (> maxw) or tall (> maxh). (2) This function assumes that textlines have sufficient vertical separation and small enough skew so that a horizontal dilation sufficient to join words will not join textlines. Images with multiple columns of text may have the textlines join across the space between columns. (3) A final filtering operation removes small components, such that width < minw or height < minh. (4) For reasonable accuracy, the resolution of pixs should be at least 100 ppi. For reasonable efficiency, the resolution should not exceed 600 ppi. (5) This can be used to determine if some region of a scanned image is horizontal text. (6) As an example, for a pix with resolution 300 ppi, a reasonable set of parameters is: pixExtractTextlines(pix, 150, 150, 10, 5);
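The two filtering stages in notes (1) and (3) apply to different objects: the initial filter acts on connected components of pixs, the final filter on the extracted lines. The following plain C sketch shows only the two size predicates. The Size struct and both function names are hypothetical stand-ins for illustration and do not reproduce the Leptonica implementation.

```c
#include <assert.h>

/* Hypothetical stand-in for the width and height of a bounding box. */
typedef struct {
    int w;
    int h;
} Size;

/* Initial filtering: a component is removed if it is wide (> maxw)
 * or tall (> maxh), so it is kept only when both limits are respected. */
int keep_component(Size s, int maxw, int maxh)
{
    assert(maxw > 0 && maxh > 0);
    return s.w <= maxw && s.h <= maxh;
}

/* Final filtering: an extracted line is removed if width < minw or
 * height < minh, so it is kept only when both minimums are met. */
int keep_textline(Size s, int minw, int minh)
{
    return s.w >= minw && s.h >= minh;
}
```

With the parameters suggested in note (6), a 400 x 40 rule or image fragment fails keep_component(s, 150, 150), while a 900 x 45 joined line passes keep_textline(s, 10, 5); a 6 x 3 speck does not.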
BOX* pixFindPageForeground | ( | PIX * | pixs, |
l_int32 | threshold, | ||
l_int32 | mindist, | ||
l_int32 | erasedist, | ||
l_int32 | pagenum, | ||
l_int32 | showmorph, | ||
l_int32 | display, | ||
const char * | pdfdir | ||
) |
Input: pixs (full resolution; any type or depth) threshold (for binarization; typically about 128) mindist (min distance of text from border to allow cleaning near border; at 2x reduction, this should be larger than 50; typically about 70) erasedist (when conditions are satisfied, erase anything within this distance of the edge; typically 30 at 2x reduction) pagenum (use for debugging when called repeatedly; labels debug images that are assembled into pdfdir) showmorph (set to a negative integer to show steps in generating masks; this is typically used for debugging region extraction) display (set to 1 to display mask and selected region for debugging a single page) pdfdir (subdirectory of /tmp where images showing the result are placed when called repeatedly; use null if no output requested) Return: box (region including foreground, with some pixel noise removed), or null if not found
Notes: (1) This doesn't simply crop to the fg. It attempts to remove pixel noise and junk at the edge of the image before cropping. The input threshold is used if pixs is not 1 bpp. (2) There are several debugging options, determined by the last 4 arguments. (3) This is not intended to work on small thumbnails. The dimensions of pixs must be at least MinWidth x MinHeight. (4) If you want pdf output of results when called repeatedly, the pagenum arg labels the images written, which go into /tmp/lept/<pdfdir>/<pagenum>.png. In that case, you would clean out the /tmp directory before calling this function on each page: lept_rmdir("/lept/<pdfdir>"); lept_mkdir("/lept/<pdfdir>");
Input: pixs (1 bpp) thresh (threshold number of pixels in row) &top (<optional return> location of top of region) &bot (<optional return> location of bottom of region) Return: 0 if OK, 1 on error
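The row-threshold idea behind these parameters can be sketched in plain C, operating on an array of per-row foreground pixel counts rather than on a PIX. The function name find_thresh_fg_extent is hypothetical, and returning 1 when no row reaches the threshold is an assumption of this sketch, not documented behavior of the library function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a Leptonica function.  rowcounts[i] holds the
 * number of foreground pixels in row i.  Finds the first (top) and last
 * (bottom) rows whose count is at least thresh.  Either output pointer
 * may be NULL.  Returns 0 if OK, 1 on bad input or if no row qualifies
 * (the latter is an assumption made for this sketch). */
int find_thresh_fg_extent(const int *rowcounts, int nrows, int thresh,
                          int *ptop, int *pbot)
{
    int i, top = -1, bot = -1;

    if (rowcounts == NULL || nrows <= 0)
        return 1;

    for (i = 0; i < nrows; i++) {
        if (rowcounts[i] >= thresh) {
            if (top < 0)
                top = i;   /* first qualifying row */
            bot = i;       /* keeps updating to the last qualifying row */
        }
    }
    if (top < 0)
        return 1;

    assert(top <= bot);
    if (ptop) *ptop = top;
    if (pbot) *pbot = bot;
    return 0;
}
```

For row counts {0, 1, 5, 9, 7, 2, 0} and a threshold of 5, the sketch reports rows 2 through 4 as the extent of the region.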
Input: pixs (1 bpp, assumed to be 150 to 200 ppi) &pixtext (<optional return> text part of pixs) &htfound (<optional return> 1 if the mask is not empty) debug (flag: 1 for debug output) Return: pixd (halftone mask), or null on error
Notes: (1) This is not intended to work on small thumbnails. The dimensions of pixs must be at least MinWidth x MinHeight.
PIX* pixGenTextblockMask | ( | PIX * | pixs, |
PIX * | pixvws, | ||
l_int32 | debug | ||
) |
Input: pixs (1 bpp, textline mask, assumed to be 150 to 200 ppi) pixvws (vertical white space mask) debug (flag: 1 for debug output) Return: pixd (textblock mask), or null on error
Notes: (1) Both the input masks (textline and vertical white space) and the returned textblock mask are at the same resolution. (2) This is not intended to work on small thumbnails. The dimensions of pixs must be at least MinWidth x MinHeight. (3) The result is somewhat noisy, in that small "blocks" of text may be included. These can be removed by post-processing, using, e.g., pixSelectBySize(pix, 60, 60, 4, L_SELECT_IF_EITHER, L_SELECT_IF_GTE, NULL);
Input: pixs (1 bpp, assumed to be 150 to 200 ppi) &pixvws (<return> vertical whitespace mask) &tlfound (<optional return> 1 if the mask is not empty) debug (flag: 1 for debug output) Return: pixd (textline mask), or null on error
Notes: (1) The input pixs should be deskewed. (2) pixs should have no halftone pixels. (3) This is not intended to work on small thumbnails. The dimensions of pixs must be at least MinWidth x MinHeight. (4) Both the input image and the returned textline mask are at the same resolution.
l_int32 pixGetRegionsBinary | ( | PIX * | pixs, |
PIX ** | ppixhm, | ||
PIX ** | ppixtm, | ||
PIX ** | ppixtb, | ||
l_int32 | debug | ||
) |
Input: pixs (1 bpp, assumed to be 300 to 400 ppi) &pixhm (<optional return> halftone mask) &pixtm (<optional return> textline mask) &pixtb (<optional return> textblock mask) debug (flag: set to 1 for debug output) Return: 0 if OK, 1 on error
Notes: (1) It is best to deskew the image before segmenting. (2) The debug flag enables a number of outputs. These are included to show how to generate and save/display these results.
pixSplitComponentWithProfile()
Input: pixs (1 bpp, exactly one connected component) delta (distance used in extrema finding in a numa; typ. 10) mindel (minimum required difference between profile minimum and profile values +2 and -2 away; typ. 7) &pixdebug (<optional return> debug image of splitting) Return: boxa (of c.c. after splitting), or null on error
Notes: (1) This will split the most obvious cases of touching characters. The split points it is searching for are narrow and deep minima in the vertical pixel projection profile, after a large vertical closing has been applied to the component.
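The "narrow and deep minimum" test governed by mindel can be illustrated with a simplified plain C sketch. It works on an array holding the vertical projection profile (foreground pixels per column) and uses a basic local-minimum check in place of the delta-based extrema finding, which is omitted here. The function name find_split_columns is hypothetical, and this is not the Leptonica implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a Leptonica function.  profile[x] is the
 * number of foreground pixels in column x.  A column is recorded as a
 * split candidate when it is a local minimum and the profile values
 * 2 columns to the left and to the right both exceed it by at least
 * mindel.  Returns the number of candidates written to splits[]. */
int find_split_columns(const int *profile, int n, int mindel,
                       int *splits, int maxsplits)
{
    int x, count = 0;

    if (profile == NULL || splits == NULL)
        return 0;

    for (x = 2; x < n - 2 && count < maxsplits; x++) {
        int v = profile[x];

        /* Must not be higher than either immediate neighbor. */
        if (v > profile[x - 1] || v > profile[x + 1])
            continue;

        /* Narrow and deep: check the values at -2 and +2. */
        if (profile[x - 2] - v >= mindel && profile[x + 2] - v >= mindel) {
            splits[count++] = x;
            x += 2;  /* skip ahead so a flat minimum is reported once */
        }
    }
    assert(count <= maxsplits || maxsplits < 0);
    return count;
}
```

For the profile {12, 11, 12, 3, 11, 12, 10, 9, 10, 12} with mindel = 7, only column 3 qualifies: the dip at column 7 is a local minimum but is too shallow relative to the values two columns away.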
l_int32 pixSplitIntoCharacters | ( | PIX * | pixs, |
l_int32 | minw, | ||
l_int32 | minh, | ||
BOXA ** | pboxa, | ||
PIXA ** | ppixa, | ||
PIX ** | ppixdebug | ||
) |
Input: pixs (1 bpp, contains only deskewed text) minw (minimum component width for initial filtering; typ. 4) minh (minimum component height for initial filtering; typ. 4) &boxa (<optional return> character bounding boxes) &pixa (<optional return> character images) &pixdebug (<optional return> showing splittings)
Return: 0 if OK, 1 on error
Notes: (1) This is a simple function that attempts to find split points based on vertical pixel profiles. (2) It should be given an image that has an arbitrary number of text characters. (3) The returned pixa includes the boxes from which the (possibly split) components are extracted.