Pod::Text

(Convert POD data to formatted text)

SYNOPSIS

    use Pod::Text;
    my $parser = Pod::Text->new (sentence => 1, width => 78);

    # Read POD from STDIN and write to STDOUT.
    $parser->parse_from_filehandle;

    # Read POD from file.pod and write to file.txt.
    $parser->parse_from_file ('file.pod', 'file.txt');

DESCRIPTION

Pod::Text is a module that can convert documentation in the POD format (the preferred language for documenting Perl) into formatted text. It uses no special formatting controls or codes, and its output is therefore suitable for nearly any device.

Encoding

Pod::Text uses the following logic to choose an output encoding, in order:

  1. If a PerlIO encoding layer is set on the output file handle, do not do any output encoding and instead rely on the PerlIO encoding layer.

  2. If the encoding or utf8 options are set, use the output encoding specified by those options.

  3. If the input encoding of the POD source file was explicitly specified (using =encoding) or automatically detected by Pod::Simple, use that as the output encoding as well.

  4. Otherwise, if running on a non-EBCDIC system, use UTF-8 as the output encoding. Since this is a superset of ASCII, this will result in ASCII output unless the POD input contains non-ASCII characters without declaring or autodetecting an encoding (usually via E<> escapes).

  5. Otherwise, for EBCDIC systems, output without doing any encoding and hope this works.

One caveat: Pod::Text has to commit to an output encoding the first time it outputs a non-ASCII character, and then has to stick with it for consistency. However, =encoding commands don't have to be at the beginning of a POD document. If someone uses a non-ASCII character early in a document with an escape, such as E<0xEF>, and then puts =encoding iso-8859-1 later, ideally Pod::Text would follow rule 3 and output the entire document as ISO 8859-1. Instead, it will commit to UTF-8 following rule 4 as soon as it sees that escape, and then stick with that encoding for the rest of the document.

Unfortunately, there's no universally good choice for an output encoding. Each choice will be incorrect in some circumstances. This approach was chosen primarily for backwards compatibility. Callers should consider forcing the output encoding via encoding if they have any knowledge about what encoding the user may expect.

In particular, consider importing the Encode::Locale module, if available, and setting encoding to locale to use an output encoding appropriate to the user's locale. But be aware that if the user is not using locales or is using a locale of C, Encode::Locale will set the output encoding to US-ASCII. This will cause all non-ASCII characters to be replaced with ? and produce a flurry of warnings about unsupported characters, which may or may not be what you want.
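As a minimal sketch of forcing the output encoding, the following formats a small POD document held in a string and requests UTF-8 output explicitly. The sample POD text and variable names are illustrative only. The encoding option is annotated below as added in Pod::Text 5.00, so older versions may not honor it.

```perl
use strict;
use warnings;
use Pod::Text;

# POD source as raw bytes. It declares its input encoding and contains
# one non-ASCII character (e-acute, bytes C3 A9 in UTF-8).
my $pod = "=encoding utf-8\n\n=head1 NAME\n\ncaf\xC3\xA9 - example\n";

# Ask for UTF-8 output instead of relying on the selection rules above.
my $parser = Pod::Text->new(encoding => 'UTF-8');

# Collect the formatted text in a scalar rather than printing to STDOUT.
my $output = '';
$parser->output_string(\$output);
$parser->parse_string_document($pod);

print $output;
```

Declaring the input encoding with =encoding and setting the output encoding separately keeps the two concerns independent, which matches the warning given under the encoding option.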

CLASS METHODS

new(ARGS)

Create a new Pod::Text object. ARGS should be a list of key/value pairs, where the keys are chosen from the following. Each option is annotated with the version of Pod::Text in which that option was added with its current meaning.

alt

[2.00] If set to a true value, selects an alternate output format that, among other things, uses a different heading style and marks =item entries with a colon in the left margin. Defaults to false.

code

[2.13] If set to a true value, the non-POD parts of the input file will be included in the output. Useful for viewing code documented with POD blocks with the POD rendered and the code left intact.

encoding

[5.00] Specifies the encoding of the output. The value must be an encoding recognized by the Encode module (see Encode::Supported). If the output contains characters that cannot be represented in this encoding, that is an error that will be reported as configured by the errors option. If error handling is other than die, the unrepresentable character will be replaced with the Encode substitution character (normally ?).

If the output file handle has a PerlIO encoding layer set, this parameter will be ignored and no encoding will be done by Pod::Text. It will instead rely on the encoding layer to make whatever output encoding transformations are desired.

WARNING: The input encoding of the POD source is independent from the output encoding, and setting this option does not affect the interpretation of the POD input. Unless your POD source is US-ASCII, its encoding should be declared with the =encoding command in the source, as near to the top of the file as possible. If this is not done, Pod::Simple will attempt to guess the encoding and may be successful if it's Latin-1 or UTF-8, but it will produce warnings. See perlpod(1) for more information.

errors

[3.17] How to report errors. die says to throw an exception on any POD formatting error. stderr says to report errors on standard error, but not to throw an exception. pod says to include a POD ERRORS section in the resulting documentation summarizing the errors. none ignores POD errors entirely, as much as possible.

The default is pod.
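The following sketch compares three of these settings on a deliberately broken document (an =over block that is never closed). The helper name format_with and the sample POD are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

# Deliberately broken POD: the =over block is never closed with =back.
my $bad_pod = "=over 4\n\n=item foo\n\nBar.\n\n=head1 NEXT\n";

# Hypothetical helper: format $bad_pod with the given errors setting.
sub format_with {
    my ($errors) = @_;
    my $parser = Pod::Text->new(errors => $errors);
    my $output = '';
    $parser->output_string(\$output);
    $parser->parse_string_document($bad_pod);
    return $output;
}

my $with_section = format_with('pod');     # error summary in the output
my $silent       = format_with('none');    # errors ignored

# errors => 'die' throws an exception, so trap it with eval.
my $ok    = eval { format_with('die'); 1 };
my $error = $ok ? '' : $@;
print "die mode reported: $error";
```

With die, the error details are still written to standard error before the exception is thrown, so callers that want quiet failure may prefer none.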

guesswork

[5.01] By default, Pod::Text applies some default formatting rules based on guesswork and regular expressions that are intended to make writing Perl documentation easier and require less explicit markup. These rules may not always be appropriate, particularly for documentation that isn't about Perl. This option allows turning all or some of it off.

The special value all enables all guesswork. This is also the default for backward compatibility reasons. The special value none disables all guesswork. Otherwise, the value of this option should be a comma-separated list of one or more of the following keywords:

quoting

If no guesswork is enabled, any text enclosed in C<> is surrounded by double quotes in the output unless the contents are already quoted. When this guesswork is enabled, quote marks will also be suppressed for Perl variables, function names, function calls, numbers, and hex constants.

Any unknown guesswork name is silently ignored (for potential future compatibility), so be careful about spelling.

indent

[2.00] The number of spaces to indent regular text, and the default indentation for =over blocks. Defaults to 4.

loose

[2.00] If set to a true value, a blank line is printed after a =head1 heading. If set to false (the default), no blank line is printed after =head1, although one is still printed after =head2. This is the default because it's the expected formatting for manual pages; if you're formatting arbitrary text documents, setting this to true may result in more pleasing output.
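A small sketch of the difference, using a hypothetical render helper and made-up sample text:

```perl
use strict;
use warnings;
use Pod::Text;

my $pod = "=head1 NAME\n\nexample - demo text\n";

# Hypothetical helper: format $pod with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $output = '';
    $parser->output_string(\$output);
    $parser->parse_string_document($pod);
    return $output;
}

my $tight = render();              # heading directly followed by text
my $loose = render(loose => 1);    # blank line after the =head1 heading

print "--- default ---\n$tight";
print "--- loose ---\n$loose";
```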

margin

[2.21] The width of the left margin in spaces. Defaults to 0. This is the margin for all text, including headings, not the amount by which regular text is indented; for the latter, see the indent option. To set the right margin, see the width option.
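To illustrate how margin and indent combine, the sketch below sets both. The sample text and the chosen values are arbitrary. The heading should start after the margin alone, while regular text is shifted by the margin plus the indent.

```perl
use strict;
use warnings;
use Pod::Text;

my $pod = "=head1 NAME\n\nexample - demo text\n";

# margin applies to everything, indent only to regular text.
my $parser = Pod::Text->new(margin => 3, indent => 2);
my $output = '';
$parser->output_string(\$output);
$parser->parse_string_document($pod);

# Show leading spaces as dots so the offsets are easy to see.
(my $visible = $output) =~ s/^( +)/'.' x length($1)/gme;
print $visible;
```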

nourls

[3.17] Normally, L<> formatting codes that contain both a URL and anchor text are formatted to show both the anchor text and the URL. In other words:

    L<foo|http://example.com/>

is formatted as:

    foo <http://example.com/>

This option, if set to a true value, suppresses the URL when anchor text is given, so this example would be formatted as just foo. This can produce less cluttered output in cases where the URLs are not particularly important.
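The same comparison can be reproduced programmatically. This is only a sketch. The sample POD is invented for the example.

```perl
use strict;
use warnings;
use Pod::Text;

my $pod = "=head1 LINKS\n\nSee L<foo|http://example.com/> for details.\n";

# Format the same document with and without URL suppression.
my %result;
for my $nourls (0, 1) {
    my $parser = Pod::Text->new(nourls => $nourls);
    my $output = '';
    $parser->output_string(\$output);
    $parser->parse_string_document($pod);
    $result{$nourls} = $output;
}

print "--- nourls => 0 ---\n$result{0}";
print "--- nourls => 1 ---\n$result{1}";
```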

quotes

[4.00] Sets the quote marks used to surround C<> text. If the value is a single character, it is used as both the left and right quote. Otherwise, it is split in half, and the first half of the string is used as the left quote and the second is used as the right quote.

This may also be set to the special value none, in which case no quote marks are added around C<> text.
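A short sketch of the three cases (default, none, and a two-character value split into left and right quotes). The sample text is illustrative, and a multi-word C<> argument is used so that the quoting guesswork does not suppress the quotes.

```perl
use strict;
use warnings;
use Pod::Text;

my $pod = "=head1 NOTES\n\nRun C<make test> first.\n";

# Hypothetical helper: format $pod with the given constructor options.
sub render {
    my (%opts) = @_;
    my $parser = Pod::Text->new(%opts);
    my $output = '';
    $parser->output_string(\$output);
    $parser->parse_string_document($pod);
    return $output;
}

my $default = render();                    # double quotes
my $none    = render(quotes => 'none');    # no quote marks
my $custom  = render(quotes => q{`'});     # left ` and right '

print $default, $none, $custom;
```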

sentence

[3.00] If set to a true value, Pod::Text will assume that each sentence ends in two spaces, and will try to preserve that spacing. If set to false, all consecutive whitespace in non-verbatim paragraphs is compressed into a single space. Defaults to false.
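For instance, the following sketch formats a paragraph whose sentences are separated by two spaces, once with and once without the option. The sample text is made up.

```perl
use strict;
use warnings;
use Pod::Text;

# Two spaces separate the sentences in the source paragraph.
my $pod = "=head1 NOTES\n\nFirst sentence.  Second sentence.\n";

my %result;
for my $sentence (0, 1) {
    my $parser = Pod::Text->new(sentence => $sentence);
    my $output = '';
    $parser->output_string(\$output);
    $parser->parse_string_document($pod);
    $result{$sentence} = $output;
}

print "--- sentence => 0 ---\n$result{0}";
print "--- sentence => 1 ---\n$result{1}";
```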

stderr

[3.10] Send error messages about invalid POD to standard error instead of appending a POD ERRORS section to the generated output. This is equivalent to setting errors to stderr if errors is not already set. It is supported for backward compatibility.

utf8

[3.12] If this option is set to a true value, the output encoding is set to UTF-8. This is equivalent to setting encoding to UTF-8 if encoding is not already set. It is supported for backward compatibility.

width

[2.00] The column at which to wrap text on the right-hand side. Defaults to 76.
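As a sketch, a narrow width makes the wrapping easy to see. The paragraph text and the value 30 are arbitrary choices for the example.

```perl
use strict;
use warnings;
use Pod::Text;

my $pod = "=head1 DESCRIPTION\n\n"
        . "This paragraph is long enough that it has to be wrapped "
        . "onto several lines when the width is small.\n";

# Wrap at column 30 instead of the default.
my $parser = Pod::Text->new(width => 30);
my $output = '';
$parser->output_string(\$output);
$parser->parse_string_document($pod);

print $output;
```

Because the default indent still applies, the text available on each line is the width minus the indentation.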

INSTANCE METHODS

As a derived class from Pod::Simple, Pod::Text supports the same methods and interfaces. See Pod::Simple for all the details. This section summarizes the most-frequently-used methods and the ones added by Pod::Text.

output_fh(FH)

Direct the output from parse_file(), parse_lines(), or parse_string_document() to the file handle FH instead of STDOUT.
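A minimal sketch of output_fh(), here using an in-memory file handle so the example is self-contained. Any writable file handle should work the same way. The sample POD is illustrative.

```perl
use strict;
use warnings;
use Pod::Text;

# Open a file handle that writes into a scalar buffer.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";

my $parser = Pod::Text->new;
$parser->output_fh($fh);
$parser->parse_string_document("=head1 NAME\n\nexample - demo text\n");

# Close the handle before reading what was written.
close($fh) or die "Cannot close in-memory handle: $!";
print $buffer;
```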

output_string(REF)

Direct the output from parse_file(), parse_lines(), or parse_string_document() to the scalar variable pointed to by REF, rather than STDOUT. For example:

    use Pod::Text;

    my $parser = Pod::Text->new();
    my $output;
    $parser->output_string(\$output);
    $parser->parse_file('/some/input/file');

Be aware that the output in that variable will already be encoded (see Encoding).

parse_file(PATH)

Read the POD source from PATH and format it. By default, the output is sent to STDOUT, but this can be changed with the output_fh() or output_string() methods.

parse_from_file(INPUT, OUTPUT)
parse_from_filehandle(FH, OUTPUT)

Read the POD source from INPUT, format it, and output the results to OUTPUT.

parse_from_filehandle() is provided for backward compatibility with older versions of Pod::Text. parse_from_file() should be used instead.
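The sketch below shows parse_from_file() converting one file to another. It writes a small POD file into a temporary directory first so that it can run anywhere. The file names and sample content are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use Pod::Text;

# Create a throwaway input file in a temporary directory.
my $dir      = tempdir(CLEANUP => 1);
my $pod_path = File::Spec->catfile($dir, 'file.pod');
my $txt_path = File::Spec->catfile($dir, 'file.txt');

open(my $in, '>', $pod_path) or die "Cannot write $pod_path: $!";
print {$in} "=head1 NAME\n\nexample - demo text\n";
close($in) or die "Cannot close $pod_path: $!";

# Convert the POD file to formatted text.
my $parser = Pod::Text->new(width => 78);
$parser->parse_from_file($pod_path, $txt_path);

# Read the result back to show what was produced.
open(my $result, '<', $txt_path) or die "Cannot read $txt_path: $!";
my $text = do { local $/; <$result> };
close($result);
print $text;
```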

parse_lines(LINES[, ...[, undef]])

Parse the provided lines as POD source, writing the output to either STDOUT or the file handle set with the output_fh() or output_string() methods. This method can be called repeatedly to provide more input lines. An explicit undef should be passed to indicate the end of input.

This method expects raw bytes, not decoded characters.
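As a sketch of incremental parsing, the following feeds a document in several calls and finishes with an explicit undef. The sample lines are illustrative, and output_string() is used only to capture the result.

```perl
use strict;
use warnings;
use Pod::Text;

my $parser = Pod::Text->new;
my $output = '';
$parser->output_string(\$output);

# Feed the document a few lines at a time.
$parser->parse_lines("=head1 NAME\n", "\n");
$parser->parse_lines("example - demo text\n");

# An explicit undef marks the end of the input.
$parser->parse_lines(undef);

print $output;
```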

parse_string_document(INPUT)

Parse the provided scalar variable as POD source, writing the output to either STDOUT or the file handle set with the output_fh() or output_string() methods.

This method expects raw bytes, not decoded characters.

FUNCTIONS

Pod::Text exports one function for backward compatibility with older versions. This function is deprecated; instead, use the object-oriented interface described above.

pod2text([[-a,] [-NNN,]] INPUT[, OUTPUT])

Convert the POD source from INPUT to text and write it to OUTPUT. If OUTPUT is not given, defaults to STDOUT. INPUT can be any expression supported as the second argument to two-argument open().

If -a is given as an initial argument, pass the alt option to the Pod::Text constructor. This enables alternative formatting.

If -NNN is given as an initial argument, pass the width option to the Pod::Text constructor with the number NNN as its argument. This sets the wrap line width to NNN.

DIAGNOSTICS

Bizarre space in item
Item called without tag

(W) Something has gone wrong in internal =item processing. These messages indicate a bug in Pod::Text; you should never see them.

Can't open %s for reading: %s

(F) Pod::Text was invoked via the compatibility mode pod2text() interface and the input file it was given could not be opened.

Invalid errors setting "%s"

(F) The errors parameter to the constructor was set to an unknown value.

Invalid quote specification "%s"

(F) The quote specification given (the quotes option to the constructor) was invalid. A quote specification must be either one character long or an even number of characters (greater than one) long.

POD document had syntax errors

(F) The POD document being formatted had syntax errors and the errors option was set to die.

COMPATIBILITY

Pod::Text 2.03 (based on Pod::Parser) was the first version of this module included with Perl, in Perl 5.6.0. Earlier versions of Perl had a different Pod::Text module, with a different API.

The current API based on Pod::Simple was added in Pod::Text 3.00. Pod::Text 3.01 was included in Perl 5.9.3, the first version of Perl to incorporate those changes. This is the first version that correctly supports all modern POD syntax. The parse_from_filehandle() method was re-added for backward compatibility in Pod::Text 3.07, included in Perl 5.9.4.

Pod::Text 3.12, included in Perl 5.10.1, first implemented the current practice of attempting to match the default output encoding with the input encoding of the POD source, unless overridden by the utf8 option or (added later) the encoding option.

Support for anchor text in L<> links of type URL was added in Pod::Text 3.14, included in Perl 5.11.5.

parse_lines(), parse_string_document(), and parse_file() set a default output file handle of STDOUT if one was not already set as of Pod::Text 3.18, included in Perl 5.19.5.

Pod::Text 4.00, included in Perl 5.23.7, aligned the module version and the version of the podlators distribution. All modules included in podlators, and the podlators distribution itself, share the same version number from this point forward.

Pod::Text 4.09, included in Perl 5.25.7, fixed a serious bug on EBCDIC systems, present in all versions back to 3.00, that would cause opening brackets to disappear.

Pod::Text 5.00 and later, included in Perl 5.37.7, default, on non-EBCDIC systems, to UTF-8 encoding if they see a non-ASCII character in the input and the input encoding is not specified. They also commit to an encoding with the first non-ASCII character and do not change the output encoding if the input encoding changes. The Encode module is now used for all output encoding rather than PerlIO layers, which fixes earlier problems with output to scalars.

CAVEATS

Line wrapping is done only at ASCII spaces and tabs, rather than using a correct Unicode-aware line wrapping algorithm.

AUTHOR

Russ Allbery <rra@cpan.org>, based very heavily on the original Pod::Text by Tom Christiansen <tchrist@mox.perl.com> and its conversion to Pod::Parser by Brad Appleton <bradapp@enteract.com>. Sean Burke's initial conversion of Pod::Man to use Pod::Simple provided much-needed guidance on how to use Pod::Simple.

COPYRIGHT AND LICENSE

Copyright 1999-2002, 2004, 2006, 2008-2009, 2012-2016, 2018-2019, 2022 Russ Allbery <rra@cpan.org>

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Encode::Locale, Encode::Supported, Pod::Simple, Pod::Text::Termcap, perlpod(1), pod2text(1)

The current version of this module is always available from its web site at <https://www.eyrie.org/~eagle/software/podlators/>. It is also part of the Perl core distribution as of 5.6.0.

Last spun 2024-11-17 from POD modified 2024-11-15