Leptonica 1.54
#include "allheaders.h"
Input: baas first (use 0 to select from the beginning) last (use 0 to select to the end) copyflag (L_COPY, L_CLONE) Return: baad, or null on error
Notes: (1) The copyflag specifies what we do with each boxa from baas. Specifically, L_CLONE inserts a clone into baad of each selected boxa from baas.
l_int32 boxaaSizeRange(BOXAA *baa, l_int32 *pminw, l_int32 *pminh, l_int32 *pmaxw, l_int32 *pmaxh)
Input: baa &minw, &minh, &maxw, &maxh (<optional return> range of dimensions of all boxes) Return: 0 if OK, 1 on error
BOXA *boxaConstrainSize(BOXA *boxas, l_int32 width, l_int32 widthflag, l_int32 height, l_int32 heightflag)
Input: boxas width (force width of all boxes to this size; input 0 to use the median width) widthflag (L_ADJUST_SKIP, L_ADJUST_LEFT, L_ADJUST_RIGHT, or L_ADJUST_LEFT_AND_RIGHT) height (force height of all boxes to this size; input 0 to use the median height) heightflag (L_ADJUST_SKIP, L_ADJUST_TOP, L_ADJUST_BOT, or L_ADJUST_TOP_AND_BOT) Return: boxad (adjusted so all boxes are the same size)
Notes: (1) Forces either width or height (or both) of every box in the boxa to a specified size, by moving the indicated sides. (2) All input boxes should be valid. Median values will be used with invalid boxes. (3) Typical input might be the output of boxaLinearFit(), where each side has been fit. (4) Unlike boxaAdjustWidthToTarget() and boxaAdjustHeightToTarget(), this is not dependent on a difference threshold to change the size.
PTA *boxaConvertToPta(BOXA *boxa, l_int32 ncorners)
Input: boxa ncorners (2 or 4 for the representation of each box) Return: pta (with points for each box in the boxa), or null on error
Notes: (1) If ncorners == 2, we select the UL and LR corners. Otherwise we save all 4 corners in this order: UL, UR, LL, LR.
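The corner ordering can be illustrated with a small standalone sketch. It does not use the Leptonica types: SketchBox, SketchPt and sketchBoxToCorners are hypothetical names, and placing the LR corner at (x + w - 1, y + h - 1) is an assumption, not something stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a box and a point; not the Leptonica types. */
typedef struct { int x, y, w, h; } SketchBox;
typedef struct { int x, y; } SketchPt;

/* Write the corners of box into pts and return how many were written.
 * ncorners == 2 gives UL, LR; any other value gives UL, UR, LL, LR.
 * Assumption: the right/bottom corner is the last pixel inside the box,
 * i.e. (x + w - 1, y + h - 1). */
int sketchBoxToCorners(SketchBox box, int ncorners, SketchPt *pts)
{
    assert(pts != NULL);
    int x2 = box.x + box.w - 1;
    int y2 = box.y + box.h - 1;
    SketchPt ul = { box.x, box.y }, ur = { x2, box.y };
    SketchPt ll = { box.x, y2 }, lr = { x2, y2 };
    if (ncorners == 2) {
        pts[0] = ul;
        pts[1] = lr;
        return 2;
    }
    pts[0] = ul;
    pts[1] = ur;
    pts[2] = ll;
    pts[3] = lr;
    return 4;
}
```

The caller must supply room for 4 points whenever ncorners is not 2.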
PIX *boxaDisplayTiled(BOXA *boxas, PIXA *pixa, l_int32 maxwidth, l_int32 linewidth, l_float32 scalefactor, l_int32 background, l_int32 spacing, l_int32 border, const char *fontdir)
Input: boxas pixa (<optional> background for each box) maxwidth (of output image) linewidth (width of box outlines, before scaling) scalefactor (applied to every box; use 1.0 for no scaling) background (0 for white, 1 for black; this is the color of the spacing between the images) spacing (between images, and on outside) border (width of black border added to each image; use 0 for no border) fontdir (<optional> can be NULL; use to number the boxes) Return: pixd (of tiled images of boxes), or null on error
Notes: (1) Displays each box separately in a tiled 32 bpp image. (2) If pixa is defined, it must have the same count as the boxa, and it will be a background over which each box is rendered. If pixa is not defined, the boxes will be rendered over blank images of identical size. (3) See pixaDisplayTiledInRows() for other parameters.
static l_int32 boxaFillAll(BOXA *boxa)
Input: boxa Return: 0 if OK, 1 on error
Notes: (1) This static function replaces every invalid box with the nearest valid box. If there are no valid boxes, it issues a warning.
BOXA *boxaFillSequence(BOXA *boxas, l_int32 useflag, l_int32 debug)
Input: boxas (with at least 3 boxes) useflag (L_USE_ALL_BOXES, L_USE_SAME_PARITY_BOXES) debug (1 for debug output) Return: boxad (filled boxa), or null on error
Notes: (1) This simple function replaces invalid boxes with a copy of the nearest valid box, selected from either the entire sequence (L_USE_ALL_BOXES) or from the boxes with the same parity (L_USE_SAME_PARITY_BOXES). It returns a new boxa. (2) This is useful if you expect boxes in the sequence to vary slowly with index.
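A minimal sketch of the nearest-valid-box replacement, written on a plain integer array rather than a boxa. The name sketchFillNearest, the validity array, and the tie-breaking rule (the lower index wins) are assumptions made for illustration; a step of 1 stands in for L_USE_ALL_BOXES and a step of 2 for L_USE_SAME_PARITY_BOXES.

```c
#include <assert.h>
#include <stddef.h>

/* Replace each invalid entry of val[0..n-1] with the nearest valid entry,
 * searching outward in steps of `step` (1 = all entries, 2 = same parity).
 * valid[i] != 0 marks a usable entry.  Returns the number of entries filled.
 * Assumption: on a tie, the entry at the lower index wins. */
int sketchFillNearest(int *val, const int *valid, int n, int step)
{
    assert(val != NULL && valid != NULL && n > 0 && step > 0);
    int filled = 0;
    for (int i = 0; i < n; i++) {
        if (valid[i]) continue;
        for (int d = step; d < n; d += step) {
            int lo = i - d, hi = i + d;
            /* Only originally valid entries are read, so filling
             * in place never propagates an already filled value. */
            if (lo >= 0 && valid[lo]) { val[i] = val[lo]; filled++; break; }
            if (hi < n && valid[hi]) { val[i] = val[hi]; filled++; break; }
        }
    }
    return filled;
}
```

An entry with no valid neighbor at any reachable distance is left unchanged, which is why the return value can be smaller than the number of invalid entries.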
l_int32 boxaGetArea(BOXA *boxa, l_int32 *parea)
Input: boxa &area (<return> total area of all boxes) Return: 0 if OK, 1 on error
Notes: (1) Measures the total area of the boxes, without regard to overlaps.
l_int32 boxaGetCoverage(BOXA *boxa, l_int32 wc, l_int32 hc, l_int32 exactflag, l_float32 *pfract)
Input: boxa wc, hc (dimensions of overall clipping rectangle with UL corner at (0, 0) that is covered by the boxes) exactflag (1 for guaranteeing an exact result; 0 for getting an exact result only if the boxes do not overlap) &fract (<return> sum of box area as fraction of wc * hc) Return: 0 if OK, 1 on error
Notes: (1) The boxes in boxa are clipped to the input rectangle. (2) * When exactflag == 1, we generate a 1 bpp pix of size wc x hc, paint all the boxes black, and count the fg pixels. This can take 1 msec on a large page with many boxes. * When exactflag == 0, we clip each box to the wc x hc region and sum the resulting areas. This is faster. * The results are the same when none of the boxes overlap within the wc x hc region.
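The difference between the two modes can be sketched without Leptonica. The code below is only an illustration: SketchBox, clipBox and sketchCoverage are hypothetical names, and a byte mask stands in for the 1 bpp pix.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int x, y, w, h; } SketchBox;   /* hypothetical box type */

/* Clip b to the rectangle [0, wc) x [0, hc); returns 0 if nothing is left. */
static int clipBox(SketchBox b, int wc, int hc, SketchBox *out)
{
    int x0 = b.x < 0 ? 0 : b.x;
    int y0 = b.y < 0 ? 0 : b.y;
    int x1 = b.x + b.w > wc ? wc : b.x + b.w;
    int y1 = b.y + b.h > hc ? hc : b.y + b.h;
    if (x1 <= x0 || y1 <= y0) return 0;
    out->x = x0; out->y = y0; out->w = x1 - x0; out->h = y1 - y0;
    return 1;
}

/* Fraction of the wc x hc rectangle covered by the boxes.
 * exactflag != 0: paint into a byte mask and count the set pixels.
 * exactflag == 0: sum the clipped areas (overcounts where boxes overlap).
 * Returns -1.0 if the mask cannot be allocated. */
double sketchCoverage(const SketchBox *boxes, int n, int wc, int hc,
                      int exactflag)
{
    assert(boxes != NULL && n >= 0 && wc > 0 && hc > 0);
    long sum = 0;
    unsigned char *mask = exactflag ? calloc((size_t)wc * hc, 1) : NULL;
    if (exactflag && mask == NULL) return -1.0;
    for (int i = 0; i < n; i++) {
        SketchBox c;
        if (!clipBox(boxes[i], wc, hc, &c)) continue;
        if (!exactflag) { sum += (long)c.w * c.h; continue; }
        for (int y = c.y; y < c.y + c.h; y++)
            for (int x = c.x; x < c.x + c.w; x++)
                mask[(size_t)y * wc + x] = 1;
    }
    if (exactflag) {
        for (size_t k = 0; k < (size_t)wc * hc; k++) sum += mask[k];
        free(mask);
    }
    return (double)sum / ((double)wc * hc);
}
```

With two 4 x 4 boxes that share a 2 x 2 corner inside an 8 x 8 rectangle, the fast mode reports 32/64 while the exact mode reports 28/64; with disjoint boxes the two agree.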
l_int32 boxaGetExtent(BOXA *boxa, l_int32 *pw, l_int32 *ph, BOX **pbox)
Input: boxa &w (<optional return> width) &h (<optional return> height) &box (<optional return>, minimum box containing all boxes in boxa) Return: 0 if OK, 1 on error
Notes: (1) The returned w and h are the minimum size image that would contain all boxes untranslated. (2) If there are no valid boxes, returned w and h are 0 and all parameters in the returned box are 0. This is not an error, because an empty boxa is valid and boxaGetExtent() is required for serialization.
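As a rough illustration of these two notes, the extent can be computed as below. SketchBox and sketchGetExtent are hypothetical names, and treating a box as valid only when both w and h are positive is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef struct { int x, y, w, h; } SketchBox;   /* hypothetical box type */

/* Compute the image size (pw, ph) that contains all valid boxes
 * untranslated, and the minimum box (pbox) that contains them.
 * Any output pointer may be NULL.  With no valid boxes, all outputs are 0.
 * Assumption: a box is valid when w > 0 and h > 0. */
int sketchGetExtent(const SketchBox *b, int n, int *pw, int *ph,
                    SketchBox *pbox)
{
    assert(b != NULL || n == 0);
    int xmin = INT_MAX, ymin = INT_MAX, xmax = 0, ymax = 0, found = 0;
    for (int i = 0; i < n; i++) {
        if (b[i].w <= 0 || b[i].h <= 0) continue;   /* skip invalid boxes */
        found = 1;
        if (b[i].x < xmin) xmin = b[i].x;
        if (b[i].y < ymin) ymin = b[i].y;
        if (b[i].x + b[i].w > xmax) xmax = b[i].x + b[i].w;
        if (b[i].y + b[i].h > ymax) ymax = b[i].y + b[i].h;
    }
    if (!found) xmin = ymin = 0;
    if (pw) *pw = xmax;
    if (ph) *ph = ymax;
    if (pbox) {
        pbox->x = xmin;
        pbox->y = ymin;
        pbox->w = xmax - xmin;
        pbox->h = ymax - ymin;
    }
    return 0;
}
```

Note that w and h are measured from the origin, while the returned box starts at the smallest UL corner found.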
BOXA *boxaLinearFit(BOXA *boxas, l_float32 factor, l_int32 debug)
Input: boxas (source boxa) factor (reject outliers with widths and heights deviating from the median by more than factor times the median deviation from the median; typically ~3) debug (1 for debug output) Return: boxad (fitted boxa), or null on error
Notes: (1) This finds a set of boxes (boxad) where each edge of each box is a linear least square fit (LSF) to the edges of the input set of boxes (boxas). Before fitting, outliers in the boxes in boxas are removed (see below). (2) This is useful when each of the box edges in boxas is expected to vary linearly with box index in the set. These could be, for example, noisy measurements of similar regions on successive scanned pages. (3) Method: there are 2 steps: (a) Find and remove outliers, separately based on the deviation from the median of the width and height of the box. Use factor to specify tolerance to outliers; use a very large value of factor to avoid rejecting any box sides in the linear LSF. (b) On the remaining boxes, do a linear LSF independently for each of the four sides. (4) Invalid input boxes are not used in computation of the LSF. (5) The returned boxad can then be used in boxaModifyWithBoxa() to selectively change the boxes in boxas.
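The two steps of the method can be sketched on plain arrays. This is not the library code: sketchMarkOutliers and sketchLinearFit are hypothetical names, the median of an even-sized set is taken as the mean of the two middle values (an assumption), and a median deviation of 0 rejects every value that differs from the median (also an assumption).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int cmpDouble(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of s[0..n-1]; sorts s in place. */
static double medianInPlace(double *s, int n)
{
    qsort(s, n, sizeof *s, cmpDouble);
    return (n % 2) ? s[n / 2] : 0.5 * (s[n / 2 - 1] + s[n / 2]);
}

static double absd(double d) { return d < 0 ? -d : d; }

/* Step (a): use[i] = 0 where v[i] deviates from the median by more than
 * factor times the median deviation from the median.  Returns 0 if OK. */
int sketchMarkOutliers(const double *v, int n, double factor, int *use)
{
    assert(v != NULL && use != NULL && n > 0);
    double *s = malloc(n * sizeof *s);
    if (s == NULL) return 1;
    memcpy(s, v, n * sizeof *s);
    double med = medianInPlace(s, n);
    for (int i = 0; i < n; i++) s[i] = absd(v[i] - med);
    double meddev = medianInPlace(s, n);
    for (int i = 0; i < n; i++)
        use[i] = absd(v[i] - med) <= factor * meddev;
    free(s);
    return 0;
}

/* Step (b): least-squares line y = a * i + b through the points
 * (i, y[i]) with use[i] != 0.  Returns 1 with fewer than 2 usable points. */
int sketchLinearFit(const double *y, const int *use, int n,
                    double *pa, double *pb)
{
    assert(y != NULL && use != NULL && pa != NULL && pb != NULL);
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    int m = 0;
    for (int i = 0; i < n; i++) {
        if (!use[i]) continue;
        sx += i; sy += y[i]; sxx += (double)i * i; sxy += i * y[i];
        m++;
    }
    double det = m * sxx - sx * sx;
    if (m < 2 || det == 0.0) return 1;
    *pa = (m * sxy - sx * sy) / det;
    *pb = (sy - *pa * sx) / m;
    return 0;
}
```

In this sketch the outlier mask would be built from the widths (and, separately, the heights), and the fit would then be run once for each of the four sides using the same mask.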
l_int32 boxaLocationRange(BOXA *boxa, l_int32 *pminx, l_int32 *pminy, l_int32 *pmaxx, l_int32 *pmaxy)
Input: boxa &minx, &miny, &maxx, &maxy (<optional return> range of UL corner positions) Return: 0 if OK, 1 on error
NUMA *boxaMakeAreaIndicator(BOXA *boxa, l_int32 area, l_int32 relation)
Input: boxa area (threshold value of width * height) relation (L_SELECT_IF_LT, L_SELECT_IF_GT, L_SELECT_IF_LTE, L_SELECT_IF_GTE) Return: na (indicator array), or null on error
Notes: (1) To keep small components, use relation = L_SELECT_IF_LT or L_SELECT_IF_LTE. To keep large components, use relation = L_SELECT_IF_GT or L_SELECT_IF_GTE.
NUMA *boxaMakeSizeIndicator(BOXA *boxa, l_int32 width, l_int32 height, l_int32 type, l_int32 relation)
Input: boxa width, height (threshold dimensions) type (L_SELECT_WIDTH, L_SELECT_HEIGHT, L_SELECT_IF_EITHER, L_SELECT_IF_BOTH) relation (L_SELECT_IF_LT, L_SELECT_IF_GT, L_SELECT_IF_LTE, L_SELECT_IF_GTE) Return: na (indicator array), or null on error
Notes: (1) The args specify constraints on the size of the components that are kept. (2) If the selection type is L_SELECT_WIDTH, the input height is ignored, and vice versa. (3) To keep small components, use relation = L_SELECT_IF_LT or L_SELECT_IF_LTE. To keep large components, use relation = L_SELECT_IF_GT or L_SELECT_IF_GTE.
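How the type and relation arguments combine can be sketched as follows. The enum names only mirror the flags listed above and their numeric values are not Leptonica's; sketchSizeIndicator is a hypothetical name. Reading L_SELECT_IF_EITHER as "either dimension satisfies the relation" and L_SELECT_IF_BOTH as "both do" is an assumption based on the flag names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirrors of the flags named above (values are arbitrary). */
enum { SEL_WIDTH, SEL_HEIGHT, SEL_IF_EITHER, SEL_IF_BOTH };
enum { SEL_IF_LT, SEL_IF_GT, SEL_IF_LTE, SEL_IF_GTE };

static int relationHolds(int value, int thresh, int relation)
{
    switch (relation) {
    case SEL_IF_LT:  return value < thresh;
    case SEL_IF_GT:  return value > thresh;
    case SEL_IF_LTE: return value <= thresh;
    case SEL_IF_GTE: return value >= thresh;
    default:         return 0;
    }
}

/* Fill ind[i] with 1 (keep) or 0 (drop) for boxes of size w[i] x h[i]. */
void sketchSizeIndicator(const int *w, const int *h, int n,
                         int width, int height, int type, int relation,
                         int *ind)
{
    assert(w != NULL && h != NULL && ind != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        int okw = relationHolds(w[i], width, relation);
        int okh = relationHolds(h[i], height, relation);
        switch (type) {
        case SEL_WIDTH:     ind[i] = okw;        break;  /* height ignored */
        case SEL_HEIGHT:    ind[i] = okh;        break;  /* width ignored */
        case SEL_IF_EITHER: ind[i] = okw || okh; break;
        case SEL_IF_BOTH:   ind[i] = okw && okh; break;
        default:            ind[i] = 0;          break;
        }
    }
}
```

The resulting array plays the role of the indicator numa: 1 marks a box to accept and 0 a box to ignore.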
BOXA *boxaModifyWithBoxa(BOXA *boxas, BOXA *boxam, l_int32 subflag, l_int32 maxdiff)
Input: boxas boxam (boxa with boxes used to modify those in boxas) subflag (L_USE_MINSIZE, L_USE_MAXSIZE, L_SUB_ON_BIG_DIFF, L_USE_CAPPED_MIN or L_USE_CAPPED_MAX) maxdiff (parameter used with L_SUB_ON_BIG_DIFF, L_USE_CAPPED_MIN and L_USE_CAPPED_MAX) Return: boxad (result after adjusting boxes in boxas), or null on error.
Notes: (1) This takes two input boxa (boxas, boxam) and constructs boxad, where each box in boxad is generated from the corresponding boxes in boxas and boxam. The rule for constructing each output box depends on subflag and maxdiff. Let boxs be a box from boxas and boxm be a box from boxam. If subflag == L_USE_MINSIZE, the output box is the intersection of the two input boxes. If subflag == L_USE_MAXSIZE, the output box is the union of the two input boxes; i.e., the minimum bounding rectangle for the two input boxes. For the remaining three flags, each side of the output box is found separately from the corresponding side of boxs and boxm, according to these rules, where "smaller" ("bigger") means in a direction that decreases (increases) the size of the output box: If subflag == L_SUB_ON_BIG_DIFF, use boxs if within maxdiff pixels of boxm; otherwise, use boxm. If subflag == L_USE_CAPPED_MIN, use the Min of boxm with the Max of (boxs, boxm +- maxdiff), where the sign is adjusted to make the box smaller (e.g., use "+" on left side). If subflag == L_USE_CAPPED_MAX, use the Max of boxm with the Min of (boxs, boxm +- maxdiff), where the sign is adjusted to make the box bigger (e.g., use "-" on left side). Use of these flags is further explained in (3) and (4).
(2) boxas and boxam must be the same size. If boxam == NULL, this returns a copy of boxas with a warning.
(3) If subflag == L_SUB_ON_BIG_DIFF, use boxm for each side where the corresponding sides differ by more than maxdiff. Two extreme cases: (a) set maxdiff == 0 to use only values from boxam in boxad. (b) set maxdiff == 10000 to ignore all values from boxam; then boxad will be the same as boxas.
(4) If subflag == L_USE_CAPPED_MAX: use boxm if boxs is smaller; use boxs if boxs is bigger than boxm by an amount up to maxdiff; and use boxm +- maxdiff (the 'capped' value) if boxs is bigger than boxm by an amount larger than maxdiff. Similarly, with interchange of Min/Max and sign of maxdiff, for subflag == L_USE_CAPPED_MIN.
(5) If either of corresponding boxes in boxas and boxam is invalid, an invalid box is copied to the result.
(6) Typical input for boxam may be the output of boxaLinearFit(), where outliers have been removed and each side is LS fit to a line.
(7) Unlike boxaAdjustWidthToTarget() and boxaAdjustHeightToTarget(), this is not dependent on a difference threshold to change the size. Additional constraints on the size of each box can be enforced by following this operation with boxaConstrainSize(), taking boxad as input.
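The per-side rules are easiest to follow for a single coordinate. The sketch below is one reading of notes (1), (3) and (4), not the library implementation: the enum values and sketchModifySide are hypothetical, and the grow argument (-1 for left/top, +1 for right/bottom) is introduced here so that "bigger" and "smaller" can be handled uniformly for all four sides.

```c
#include <assert.h>

/* Hypothetical mirrors of the per-side flags (values are arbitrary). */
enum { SUB_ON_BIG_DIFF, USE_CAPPED_MIN, USE_CAPPED_MAX };

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Combine one side of boxs (s) with the same side of boxm (m).
 * grow is -1 for left/top, where the box gets bigger as the coordinate
 * decreases, and +1 for right/bottom. */
int sketchModifySide(int s, int m, int subflag, int maxdiff, int grow)
{
    assert(grow == 1 || grow == -1);
    assert(maxdiff >= 0);
    /* Work in "extent" units, where a larger value means a bigger box. */
    int es = grow * s, em = grow * m, e;
    switch (subflag) {
    case SUB_ON_BIG_DIFF: {
        int d = es > em ? es - em : em - es;
        e = (d <= maxdiff) ? es : em;        /* keep boxs only if close */
        break;
    }
    case USE_CAPPED_MIN:
        e = imin(em, imax(es, em - maxdiff));  /* shrink by at most maxdiff */
        break;
    case USE_CAPPED_MAX:
        e = imax(em, imin(es, em + maxdiff));  /* grow by at most maxdiff */
        break;
    default:
        e = es;
        break;
    }
    return grow * e;
}
```

For a left side with boxm at x = 100 and maxdiff = 10, L_USE_CAPPED_MAX in this reading gives 100 when boxs is at 120 (smaller box), 95 when boxs is at 95, and 90 when boxs is at 80, matching the "-" sign on the left side described in note (1).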
BOXA *boxaPermutePseudorandom(BOXA *boxas)
Input: boxas (input boxa) Return: boxad (with boxes permuted), or null on error
Notes: (1) This does a pseudorandom in-place permutation of the boxes. (2) The result is guaranteed not to have any boxes in their original position, but it is not very random. If you need randomness, use boxaPermuteRandom().
BOXA *boxaPermuteRandom(BOXA *boxad, BOXA *boxas)
Input: boxad (<optional> can be null or equal to boxas) boxas (input boxa) Return: boxad (with boxes permuted), or null on error
Notes: (1) If boxad is null, make a copy of boxas and permute the copy. Otherwise, boxad must be equal to boxas, and the operation is done in-place. (2) This does a random in-place permutation of the boxes, by swapping each box in turn with a random box. The result is almost guaranteed not to have any boxes in their original position. (3) MSVC rand() has RAND_MAX = 2^15 - 1, so it will not do a proper permutation if the number of boxes exceeds this.
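The swap procedure in note (2) can be sketched on an integer array. sketchPermuteRandom is a hypothetical name, and using rand() % n to pick the partner is an assumption about how the random index is drawn.

```c
#include <assert.h>
#include <stdlib.h>

/* Permute a[0..n-1] in place by swapping each element in turn with an
 * element at a random index.  The index comes from rand(), so it cannot
 * reach positions beyond RAND_MAX. */
void sketchPermuteRandom(int *a, int n)
{
    assert(a != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        int j = rand() % n;
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Swapping against the whole range in this way always yields a valid permutation, but the permutations are not equally likely; a Fisher-Yates shuffle would be the usual choice when uniformity matters.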
l_int32 boxaPlotSides(BOXA *boxa, const char *plotname, NUMA **pnal, NUMA **pnat, NUMA **pnar, NUMA **pnab, l_int32 outformat)
Input: boxa (source boxa) plotname (<optional>, can be NULL) &nal (<optional return> na of left sides) &nat (<optional return> na of top sides) &nar (<optional return> na of right sides) &nab (<optional return> na of bottom sides) outformat (GPLOT_NONE for no output; GPLOT_PNG for png, etc) Return: 0 if OK, 1 on error
Notes: (1) This is a debugging function to show the progression of the four sides in the boxes. There must be at least 2 boxes. (2) If there are invalid boxes (e.g., if only even or odd indices have valid boxes), this will fill them with the nearest valid box before plotting. (3) The plotfiles are put in /tmp/plotsides, and are named either with plotname or, if plotname is NULL, a default name.
BOXA *boxaReconcileEvenOddHeight(BOXA *boxas, l_int32 sides, l_int32 delh, l_int32 op, l_float32 factor)
Input: boxas (containing at least 3 valid boxes in even and odd) sides (L_ADJUST_TOP, L_ADJUST_BOT, L_ADJUST_TOP_AND_BOT) delh (threshold on median height difference) op (L_ADJUST_CHOOSE_MIN, L_ADJUST_CHOOSE_MAX) factor (> 0.0, typically near 1.0) Return: boxad (adjusted), or a copy of boxas on error
Notes: (1) The basic idea is to reconcile differences in box height in the even and odd boxes, by moving the top and/or bottom edges in the even and odd boxes. Choose the edge or edges to be moved, whether to adjust the boxes with the min or the max of the medians, and the threshold on the median difference between even and odd box heights for the operations to take place. The same threshold is also used to determine if each individual box edge is to be adjusted. (2) Boxes are conditionally reset with either the same top (y) value or the same bottom value, or both. The value is determined by the greater or lesser of the medians of the even and odd boxes, with the choice depending on the value of op, which selects for either min or max median height. If the median difference between even and odd boxes is greater than delh, then any individual box edge that differs from the selected median by more than delh is set to the selected median times a factor typically near 1.0. (3) Note that if selecting for minimum height, you will choose the largest y-value for the top and the smallest y-value for the bottom of the box. (4) Typical input might be the output of boxaSmoothSequence(), where even and odd boxa have been independently regulated. (5) Require at least 3 valid even boxes and 3 valid odd boxes. Median values will be used for invalid boxes.
BOXA *boxaReconcilePairWidth(BOXA *boxas, l_int32 delw, l_int32 op, l_float32 factor, NUMA *na)
Input: boxas delw (threshold on adjacent width difference) op (L_ADJUST_CHOOSE_MIN, L_ADJUST_CHOOSE_MAX) factor (> 0.0, typically near 1.0) na (<optional> indicator array allowing change) Return: boxad (adjusted), or a copy of boxas on error
Notes: (1) This reconciles differences in the width of adjacent boxes, by moving one side of one of the boxes in each pair. If the widths in the pair differ by more than some threshold, move either the left side for even boxes or the right side for odd boxes, depending on whether we're choosing the min or max. If choosing min, the width of the max is set to factor * (width of min). If choosing max, the width of the min is set to factor * (width of max). (2) If na exists, it is an indicator array corresponding to the boxes in boxas. If na != NULL, only boxes with an indicator value of 1 are allowed to adjust; otherwise, all boxes can adjust. (3) Typical input might be the output of boxaSmoothSequence(), where even and odd boxa have been independently regulated.
Input: boxas area (threshold value of width * height) relation (L_SELECT_IF_LT, L_SELECT_IF_GT, L_SELECT_IF_LTE, L_SELECT_IF_GTE) &changed (<optional return> 1 if changed; 0 if clone returned) Return: boxad (filtered set), or null on error
Notes: (1) Uses box clones in the new boxa. (2) To keep small components, use relation = L_SELECT_IF_LT or L_SELECT_IF_LTE. To keep large components, use relation = L_SELECT_IF_GT or L_SELECT_IF_GTE.
BOXA *boxaSelectBySize(BOXA *boxas, l_int32 width, l_int32 height, l_int32 type, l_int32 relation, l_int32 *pchanged)
Input: boxas width, height (threshold dimensions) type (L_SELECT_WIDTH, L_SELECT_HEIGHT, L_SELECT_IF_EITHER, L_SELECT_IF_BOTH) relation (L_SELECT_IF_LT, L_SELECT_IF_GT, L_SELECT_IF_LTE, L_SELECT_IF_GTE) &changed (<optional return> 1 if changed; 0 if clone returned) Return: boxad (filtered set), or null on error
Notes: (1) The args specify constraints on the size of the components that are kept. (2) Uses box clones in the new boxa. (3) If the selection type is L_SELECT_WIDTH, the input height is ignored, and vice versa. (4) To keep small components, use relation = L_SELECT_IF_LT or L_SELECT_IF_LTE. To keep large components, use relation = L_SELECT_IF_GT or L_SELECT_IF_GTE.
Input: boxas first (use 0 to select from the beginning) last (use 0 to select to the end) copyflag (L_COPY, L_CLONE) Return: boxad, or null on error
Notes: (1) The copyflag specifies what we do with each box from boxas. Specifically, L_CLONE inserts a clone into boxad of each selected box from boxas.
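The first/last convention described above can be sketched as a small index-resolution helper. sketchResolveRange is a hypothetical name; clamping a negative first to 0 and treating an oversized last as "to the end" are assumptions, not behavior stated in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Resolve a (first, last) request against n items into an inclusive range.
 * Returns 0 if OK, 1 if the range is empty or out of bounds.
 * Assumptions: a negative first is treated as 0; last <= 0 or last >= n
 * means "through the final item". */
int sketchResolveRange(int n, int first, int last, int *pfirst, int *plast)
{
    assert(pfirst != NULL && plast != NULL);
    if (n <= 0) return 1;
    if (first < 0) first = 0;
    if (last <= 0 || last >= n) last = n - 1;
    if (first >= n || first > last) return 1;
    *pfirst = first;
    *plast = last;
    return 0;
}
```

One consequence of using 0 for "to the end" is that a request of first = 0, last = 0 selects everything rather than only the first item.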
BOXA *boxaSelectWithIndicator(BOXA *boxas, NUMA *na, l_int32 *pchanged)
Input: boxas na (indicator numa) &changed (<optional return> 1 if changed; 0 if clone returned) Return: boxad, or null on error
Notes: (1) Returns a boxa clone if no components are removed. (2) Uses box clones in the new boxa. (3) The indicator numa has values 0 (ignore) and 1 (accept).
l_int32 boxaSizeRange(BOXA *boxa, l_int32 *pminw, l_int32 *pminh, l_int32 *pmaxw, l_int32 *pmaxh)
Input: boxa &minw, &minh, &maxw, &maxh (<optional return> range of dimensions of boxes in the array) Return: 0 if OK, 1 on error
BOXA *boxaSmoothSequenceLS(BOXA *boxas, l_float32 factor, l_int32 subflag, l_int32 maxdiff, l_int32 debug)
Input: boxas (source boxa) factor (reject outliers with widths and heights deviating from the median by more than factor times the median variation from the median; typically ~3) subflag (L_USE_MINSIZE, L_USE_MAXSIZE, L_SUB_ON_BIG_DIFF, L_USE_CAPPED_MIN or L_USE_CAPPED_MAX) maxdiff (parameter used with L_SUB_ON_BIG_DIFF, L_USE_CAPPED_MIN and L_USE_CAPPED_MAX) debug (1 for debug output) Return: boxad (fitted boxa), or null on error
Notes: (1) This returns a modified version of boxas by constructing for each input box a box that has been linear least square fit (LSF) to the entire set. The linear fitting is done to each of the box sides independently, after outliers are rejected, and it is computed separately for sequences of even and odd boxes. Once the linear LSF box is found, the output box (in boxad) is constructed from the input box and the LSF box, depending on subflag. See boxaModifyWithBoxa() for details on the use of subflag and maxdiff. (2) This is useful if, in both the even and odd sets, the box edges vary roughly linearly with their index in the set.
BOXA *boxaSmoothSequenceMedian(BOXA *boxas, l_int32 halfwin, l_int32 subflag, l_int32 maxdiff, l_int32 debug)
Input: boxas (source boxa) halfwin (half-width of sliding window; used to find median) subflag (L_USE_MINSIZE, L_USE_MAXSIZE, L_SUB_ON_BIG_DIFF, L_USE_CAPPED_MIN or L_USE_CAPPED_MAX) maxdiff (parameter used with L_SUB_ON_BIG_DIFF, L_USE_CAPPED_MIN and L_USE_CAPPED_MAX) debug (1 for debug output) Return: boxad (fitted boxa), or null on error
Notes: (1) The target width of the sliding window is 2 * halfwin + 1. If necessary, this will be reduced by boxaWindowedMedian(). (2) This returns a modified version of boxas by constructing for each input box a box that has been smoothed with windowed median filtering. The filtering is done to each of the box sides independently, and it is computed separately for sequences of even and odd boxes. The output boxad is constructed from the input boxa and the filtered boxa, depending on subflag. See boxaModifyWithBoxa() for details on the use of subflag and maxdiff. (3) This is useful for removing noise separately in the even and odd sets, where the box edge locations can have discontinuities but otherwise vary roughly linearly within intervals of size halfwin or larger. (4) If you don't need to handle even and odd sets separately, just do this:
    boxam = boxaWindowedMedian(boxas, halfwin, debug);
    boxad = boxaModifyWithBoxa(boxas, boxam, subflag, maxdiff);
    boxaDestroy(&boxam);
l_int32 boxaSwapBoxes(BOXA *boxa, l_int32 i, l_int32 j)
Input: boxa i, j (two indices of boxes, that are to be swapped) Return: 0 if OK, 1 on error
BOXA *boxaWindowedMedian(BOXA *boxas, l_int32 halfwin, l_int32 debug)
Input: boxas (source boxa) halfwin (half width of window over which the median is found) debug (1 for debug output) Return: boxad (smoothed boxa), or null on error
Notes: (1) This finds a set of boxes (boxad) where each edge of each box is a windowed median smoothed value to the edges of the input set of boxes (boxas). (2) Invalid input boxes are filled from nearby ones. (3) The returned boxad can then be used in boxaModifyWithBoxa() to selectively change the boxes in the source boxa.
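A windowed median over one side coordinate can be sketched as below. This is an illustration on a plain array, not the library routine: sketchWindowedMedian is a hypothetical name, the window is simply truncated at the ends of the array, and the upper middle value is taken when the truncated window has an even count (both are assumptions).

```c
#include <assert.h>
#include <stdlib.h>

static int cmpInt(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Median-smooth v[0..n-1] into out[0..n-1] using a window of up to
 * 2 * halfwin + 1 values centered on each index.
 * Assumption: the window is truncated at the array ends.
 * Returns 0 if OK, 1 if the scratch buffer cannot be allocated. */
int sketchWindowedMedian(const int *v, int n, int halfwin, int *out)
{
    assert(v != NULL && out != NULL && n > 0 && halfwin >= 0);
    int *win = malloc((2 * (size_t)halfwin + 1) * sizeof *win);
    if (win == NULL) return 1;
    for (int i = 0; i < n; i++) {
        int lo = i - halfwin < 0 ? 0 : i - halfwin;
        int hi = i + halfwin > n - 1 ? n - 1 : i + halfwin;
        int cnt = hi - lo + 1;
        for (int k = 0; k < cnt; k++) win[k] = v[lo + k];
        qsort(win, cnt, sizeof *win, cmpInt);
        out[i] = win[cnt / 2];
    }
    free(win);
    return 0;
}
```

Running this once per side (left, top, right, bottom) gives the four smoothed sequences from which a smoothed box for each index could be assembled.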
PTA *boxConvertToPta(BOX *box, l_int32 ncorners)
Input: box ncorners (2 or 4 for the representation of the box) Return: pta (with points), or null on error
Notes: (1) If ncorners == 2, we select the UL and LR corners. Otherwise we save all 4 corners in this order: UL, UR, LL, LR.
BOX *ptaConvertToBox(PTA *pta)
Input: pta Return: box (minimum containing all points in the pta), or null on error
Notes: (1) For 2 corners, the order of the 2 points is UL, LR. For 4 corners, the order of points is UL, UR, LL, LR.
BOXA *ptaConvertToBoxa(PTA *pta, l_int32 ncorners)
Input: pta ncorners (2 or 4 for the representation of each box) Return: boxa (with one box for each 2 or 4 points in the pta), or null on error
Notes: (1) For 2 corners, the order of the 2 points is UL, LR. For 4 corners, the order of points is UL, UR, LL, LR. (2) Each derived box is the minimum size containing all corners.
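The grouping of points into boxes can be sketched as follows. SketchPt, SketchBox and sketchPtsToBoxes are hypothetical names, and treating the corners as inclusive pixel positions (so that w = xmax - xmin + 1) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y; } SketchPt;          /* hypothetical point type */
typedef struct { int x, y, w, h; } SketchBox;   /* hypothetical box type */

/* Build one box from each group of ncorners points (2 or 4).
 * Each box is the smallest one containing all corners of its group.
 * Assumption: corners are inclusive, so w = xmax - xmin + 1.
 * Returns the number of boxes written, or -1 on bad input. */
int sketchPtsToBoxes(const SketchPt *pts, int npts, int ncorners,
                     SketchBox *boxes)
{
    assert(pts != NULL && boxes != NULL);
    if ((ncorners != 2 && ncorners != 4) || npts < 0 ||
        npts % ncorners != 0)
        return -1;
    int nbox = npts / ncorners;
    for (int i = 0; i < nbox; i++) {
        const SketchPt *g = pts + i * ncorners;
        int xmin = g[0].x, xmax = g[0].x;
        int ymin = g[0].y, ymax = g[0].y;
        for (int k = 1; k < ncorners; k++) {
            if (g[k].x < xmin) xmin = g[k].x;
            if (g[k].x > xmax) xmax = g[k].x;
            if (g[k].y < ymin) ymin = g[k].y;
            if (g[k].y > ymax) ymax = g[k].y;
        }
        boxes[i].x = xmin;
        boxes[i].y = ymin;
        boxes[i].w = xmax - xmin + 1;
        boxes[i].h = ymax - ymin + 1;
    }
    return nbox;
}
```

Because each box is taken as the bounding rectangle of its group, the result does not depend on the points actually arriving in UL, UR, LL, LR order.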