perl5220delta - what is new for perl v5.22.0



NAME

perl5220delta - what is new for perl v5.22.0


DESCRIPTION

This document describes differences between the 5.20.0 release and the 5.22.0 release.

If you are upgrading from an earlier release such as 5.18.0, first read the perl5200delta manpage, which describes differences between 5.18.0 and 5.20.0.


Core Enhancements

New bitwise operators

A new experimental facility has been added that makes the four standard bitwise operators (& | ^ ~) treat their operands consistently as numbers, and introduces four new dotted operators (&. |. ^. ~.) that treat their operands consistently as strings. The same applies to the assignment variants (&= |= ^= &.= |.= ^.=).

To use this, enable the ``bitwise'' feature and disable the ``experimental::bitwise'' warnings category. See Bitwise String Operators in the perlop manpage for details. [perl #123466].
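As a minimal sketch of the difference (assuming a perl of 5.22 or later; the variable names are only illustrative), the same two operands give different results depending on which operator family is used:

```perl
use strict;
use warnings;
use feature 'bitwise';
no warnings 'experimental::bitwise';

# With the feature enabled, | always treats its operands as numbers,
# even when they are written as strings.
my $numeric = "150" | "105";    # 150 | 105

# The dotted form always treats its operands as strings and ORs them
# character by character, even when they are written as numbers.
my $stringy = 150 |. 105;       # "150" |. "105"

print "numeric OR: $numeric\n";
print "string OR:  $stringy\n";
```

Here the numeric form yields 255 and the string form yields "155".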

New double-diamond operator

<<>> is like <> but uses three-argument open to open each file in @ARGV. This means that each element of @ARGV will be treated as an actual file name, and "|foo" won't be treated as a pipe open.
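The following sketch reads a file through <<>>. The temporary file and its contents are made up for the example; any readable file named in @ARGV would do.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file so the example is self-contained.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "first\nsecond\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";

# Every element of @ARGV is opened as a plain file name by <<>>.
# An element such as "|foo" would be looked up as a file of that name
# rather than run as a command.
@ARGV = ($tmp_name);

my @lines;
while (defined(my $line = <<>>)) {
    chomp $line;
    push @lines, $line;
}

print "read: @lines\n";
```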

New \b boundaries in regular expressions

qr/\b{gcb}/

gcb stands for Grapheme Cluster Boundary. It is a Unicode property that finds the boundary between sequences of characters that look like a single character to a native speaker of a language. Perl has long had the ability to deal with these through the \X regular expression escape sequence. Now, there is an alternative way of handling these. See \b{}, \b, \B{}, \B in the perlrebackslash manpage for details.

qr/\b{wb}/

wb stands for Word Boundary. It is a Unicode property that finds the boundary between words. This is similar to the plain \b (without braces) but is more suitable for natural language processing. It knows, for example, that apostrophes can occur in the middle of words. See \b{}, \b, \B{}, \B in the perlrebackslash manpage for details.

qr/\b{sb}/

sb stands for Sentence Boundary. It is a Unicode property to aid in parsing natural language sentences. See \b{}, \b, \B{}, \B in the perlrebackslash manpage for details.
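A short sketch of how the braced forms differ from plain \b, assuming a perl of 5.22 or later. The sample strings are invented for the example.

```perl
use strict;
use warnings;

my $word = "don't";

# Plain \b sees a word/non-word boundary between "n" and the apostrophe.
my $plain_b = ($word =~ /n\b/) ? 1 : 0;

# \b{wb} follows the Unicode word-boundary rules, which keep an
# apostrophe between two letters inside the word.
my $unicode_wb = ($word =~ /n\b{wb}/) ? 1 : 0;

# "e" followed by U+0301 COMBINING ACUTE ACCENT is one grapheme cluster,
# so there is no \b{gcb} boundary inside it...
my $inside_cluster = ("e\x{301}" =~ /e\b{gcb}\x{301}/) ? 1 : 0;

# ...whereas two separate letters do have a boundary between them.
my $between_letters = ("ea" =~ /e\b{gcb}a/) ? 1 : 0;

print "plain \\b after n:       $plain_b\n";
print "\\b{wb} after n:         $unicode_wb\n";
print "\\b{gcb} inside cluster: $inside_cluster\n";
print "\\b{gcb} between e, a:   $between_letters\n";
```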

Non-Capturing Regular Expression Flag

Regular expressions now support a /n flag that disables capturing and filling in $1, $2, etc. inside groups:

  "hello" =~ /(hi|hello)/n; # $1 is not set

This is equivalent to putting ?: at the beginning of every capturing group.

See n in the perlre manpage for more information.
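One way to observe the effect is to look at what a match returns in list context. In this sketch (sample string invented for the example), the /n version behaves like the version with every group written as (?:...).

```perl
use strict;
use warnings;

my $text = "hello world";

# Ordinary groups capture, so a list-context match returns the captures.
my @without_n = ($text =~ /(hi|hello) (world)/);

# Under /n the same groups do not capture; a successful match in list
# context with no capture groups returns the single value 1.
my @with_n = ($text =~ /(hi|hello) (world)/n);

# Writing ?: by hand in every group gives the same result.
my @explicit = ($text =~ /(?:hi|hello) (?:world)/);

print "without /n: @without_n\n";
print "with /n:    @with_n\n";
print "with ?:     @explicit\n";
```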

use re 'strict'

This applies stricter syntax rules to regular expression patterns compiled within its scope. This will hopefully alert you to typos and other unintentional behavior that backwards-compatibility issues prevent us from reporting in normal regular expression compilations. Because the behavior of this is subject to change in future Perl releases as we gain experience, using this pragma will raise a warning of category experimental::re_strict. See 'strict' mode in the re manpage.

Unicode 7.0 (with correction) is now supported

For details on what is in this release, see http://www.unicode.org/versions/Unicode7.0.0/. The version of Unicode 7.0 that comes with Perl includes a correction dealing with glyph shaping in Arabic (see http://www.unicode.org/errata/#current_errata).

use locale can restrict which locale categories are affected

It is now possible to pass a parameter to use locale to specify a subset of locale categories to be locale-aware, with the remaining ones unaffected. See The ``use locale'' pragma in the perllocale manpage for details.

Perl now supports POSIX 2008 locale currency additions

On platforms that are able to handle POSIX.1-2008, the hash returned by POSIX::localeconv() includes the international currency fields added by that version of the POSIX standard. These are int_n_cs_precedes, int_n_sep_by_space, int_n_sign_posn, int_p_cs_precedes, int_p_sep_by_space, and int_p_sign_posn.
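A small sketch that reports which of these fields the current platform and locale supply. Whether any of them appear depends on the system, so the script treats a missing field as normal.

```perl
use strict;
use warnings;
use POSIX qw(localeconv);

# localeconv() returns a reference to a hash of locale conventions.
my $conv = localeconv();

# The international currency fields named above.
my @new_fields = qw(
    int_n_cs_precedes int_n_sep_by_space int_n_sign_posn
    int_p_cs_precedes int_p_sep_by_space int_p_sign_posn
);

for my $field (@new_fields) {
    my $value = exists $conv->{$field} ? $conv->{$field} : '(not reported)';
    print "$field = $value\n";
}
```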

Better heuristics on older platforms for determining locale UTF-8ness

On platforms that implement neither the C99 standard nor the POSIX 2001 standard, determining if the current locale is UTF-8 or not depends on heuristics. These are improved in this release.

Aliasing via reference

Variables and subroutines can now be aliased by assigning to a reference:

    \$c = \$d;
    \&x = \&y;

Aliasing can also be accomplished by using a backslash before a foreach iterator variable; this is perhaps the most useful idiom this feature provides:

    foreach \%hash (@array_of_hash_refs) { ... }

This feature is experimental and must be enabled via use feature 'refaliasing'. It will warn unless the experimental::refaliasing warnings category is disabled.

See Assigning to References in the perlref manpage.
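A sketch of both forms, assuming a perl of 5.22 or later with the feature still marked experimental. The data is invented for the example.

```perl
use strict;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Scalar aliasing: after the assignment, $c is another name for $d.
my $d = 10;
my $c;
\$c = \$d;
$c = 20;    # also changes $d

# Aliasing the foreach iterator: %hash is each referenced hash in turn,
# so there is no need to write $ref->{...} inside the loop.
my @array_of_hash_refs = ({ name => 'a' }, { name => 'b' });
foreach \my %hash (@array_of_hash_refs) {
    $hash{seen} = 1;    # writes through to the original hash
}

print "d is now $d\n";
print "$_->{name} seen=$_->{seen}\n" for @array_of_hash_refs;
```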

prototype with no arguments

prototype() with no arguments now infers $_. [perl #123514].
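For instance, looping over code references sets $_ on each pass, so the argument can be left out. The two subs here are hypothetical and exist only to carry a prototype.

```perl
use strict;
use warnings;

# Two illustrative subs with different prototypes.
sub pair  ($$) { return "$_[0]:$_[1]" }
sub maybe (;$) { return defined $_[0] ? $_[0] : 'none' }

my @prototypes;
for (\&pair, \&maybe) {
    # No argument: prototype() looks at $_.
    push @prototypes, prototype();
}

print "prototypes: @prototypes\n";
```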

New :const subroutine attribute

The const attribute can be applied to an anonymous subroutine. It causes the new sub to be executed immediately whenever one is created (i.e. when the sub expression is evaluated). Its value is captured and used to create a new constant subroutine that is returned. This feature is experimental. See Constant Functions in the perlsub manpage.
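The sketch below counts how often the body runs. It assumes the warning category for this feature is experimental::const_attr, and the counter and loop are invented for the example.

```perl
use strict;
use warnings;
no warnings 'experimental::const_attr';

my $evaluations = 0;
my @subs;

for my $n (1 .. 3) {
    # The body runs once, right here, each time the sub expression is
    # evaluated; what is stored is a constant sub holding the result.
    push @subs, sub :const { $evaluations++; $n * 10 };
}
my $after_creation = $evaluations;

# Calling the constant subs returns the captured values without
# running the body again.
my @values = map { $_->() } @subs;
my $after_calls = $evaluations;

print "values: @values\n";
print "body ran $after_creation times at creation, $after_calls in total\n";
```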

fileno now works on directory handles

When the relevant support is available in the operating system, the fileno builtin now works on directory handles, yielding the underlying file descriptor in the same way as for filehandles. On operating systems without such support, fileno on a directory handle continues to return the undefined value, as before, but also sets $! to indicate that the operation is not supported.

Currently, this uses either a dd_fd member in the OS DIR structure, or a dirfd(3) function as specified by POSIX.1-2008.
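A sketch that copes with both outcomes, using the current directory as an arbitrary example:

```perl
use strict;
use warnings;

opendir(my $dh, '.') or die "Cannot open current directory: $!";

# On systems with the needed support this is the underlying descriptor;
# elsewhere it is undef and $! says the operation is unsupported.
my $fd = fileno($dh);

my $report = defined $fd
    ? "directory handle uses file descriptor $fd"
    : "fileno is not available for directory handles here: $!";
print "$report\n";

closedir $dh;
```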

List form of pipe open implemented for Win32

The list form of pipe:

  open my $fh, "-|", "program", @arguments;

is now implemented on Win32. It has the same limitations as system LIST on Win32, since the Win32 API doesn't accept program arguments as a list.
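As an illustration, the following runs the current perl binary ($^X) as the child program, so that it does not depend on any other command being installed:

```perl
use strict;
use warnings;

# List form: the program and each argument are separate list elements,
# so no shell quoting is written by hand.
open(my $fh, '-|', $^X, '-e', 'print qq(hello\n)')
    or die "Cannot start child perl: $!";

# Slurp everything the child printed.
my $output = do { local $/; <$fh> };

close $fh or die "Child perl failed: status $?";

print "child said: $output";
```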

Assignment to list repetition

(...) x ... can now be used within a list that is assigned to, as long as the left-hand side is a valid lvalue. This allows (undef,undef,$foo) = that_function() to be written as ((undef)x2, $foo) = that_function().
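A runnable version of that example, with that_function() stubbed out to return three values:

```perl
use strict;
use warnings;

# Stand-in for a function whose first two return values are not needed.
sub that_function { return ('skip', 'skip too', 'keep') }

my $foo;

# (undef) x 2 expands to two undef placeholders on the left-hand side.
((undef) x 2, $foo) = that_function();

print "foo is $foo\n";
```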

Infinity and NaN (not-a-number) handling improved

Floating point values are able to hold the special values infinity, negative infinity, and NaN (not-a-number). Now we more robustly recognize and propagate these values in computations, and on output normalize them to the strings Inf, -Inf, and NaN.

See also the POSIX manpage enhancements.
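A brief sketch of the propagation and the normalized output. It produces infinity by overflow, which is one common idiom rather than the only way.

```perl
use strict;
use warnings;

my $inf     = 9**9**9;        # overflows to infinity
my $neg_inf = -$inf;
my $nan     = $inf - $inf;    # infinity minus infinity is not a number

# The special values survive further arithmetic.
my $still_inf = $inf + 1;

# NaN is the only value that is not equal to itself.
my $nan_differs = ($nan != $nan) ? 1 : 0;

print "values: $inf $neg_inf $nan\n";
print "inf + 1 is $still_inf\n";
```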

Floating point parsing has been improved

Parsing and printing of floating point values has been improved.

As a completely new feature, hexadecimal floating point literals (like 0x1.23p-4) are now supported, and they can be output with printf "%a". See Scalar value constructors in the perldata manpage for more details.
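For example (the exact text produced by %a may vary with the platform's floating point format):

```perl
use strict;
use warnings;

# 0x1.23p-4 means (1 + 0x23/0x100) * 2**-4.
my $value = 0x1.23p-4;

# A simpler literal: 1 * 2**-1.
my $half = 0x1p-1;

# %a formats a number as a hexadecimal floating point string.
my $hex = sprintf '%a', $value;

print "decimal: $value\n";
print "hex:     $hex\n";
```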

Packing infinity or not-a-number into a character is now fatal

Before, when trying to pack infinity or not-a-number into a (signed) character, Perl would warn and assume you had tried to pack 0xFF; if you gave it as an argument to chr, U+FFFD was returned.

But now, all such actions (pack, chr, and printf '%c') result in a fatal error.
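Code that may encounter such values can trap the error with eval. The sketch below tries each operation in turn, using sprintf as a stand-in for the %c format so that nothing is printed:

```perl
use strict;
use warnings;

my $inf = 9**9**9;    # overflows to infinity

my %error;
for my $case (
    [ 'chr',     sub { chr($inf) } ],
    [ 'pack',    sub { pack('c', $inf) } ],
    [ 'sprintf', sub { sprintf('%c', $inf) } ],
) {
    my ($name, $code) = @$case;
    # Each of these dies, so record the message instead of a result.
    eval { my $ignored = $code->(); 1 } or $error{$name} = $@;
}

print "$_: $error{$_}" for sort keys %error;
```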

Experimental C Backtrace API

Perl now supports (via a C level API) retrieving the C level backtrace (similar to what symbolic debuggers like gdb do).

The backtrace returns the stack trace of the C call frames, with the symbol names (function names), the object names (like ``perl''), and if it can, also the source code locations (file:line).

The supported platforms are Linux and OS X (some *BSD might work at least partly, but they have not yet been tested).

The feature needs to be enabled with Configure -Dusecbacktrace.

See C backtrace in the perlhacktips manpage for more information.


Security

Perl is now compiled with -fstack-protector-strong if available

Perl has been compiled with the anti-stack-smashing option -fstack-protector since 5.10.1. Now Perl uses the newer variant called -fstack-protector-strong, if available.

The Safe module could allow outside packages to be replaced

Critical bugfix: outside packages could be replaced. The Safe module has been patched to 2.38 to address this.

Perl is now always compiled with -D_FORTIFY_SOURCE=2 if available

The 'code hardening' option called _FORTIFY_SOURCE, available in gcc 4.*, is now always used for compiling Perl, if available.

Note that this isn't necessarily a huge step since on many platforms the step had already been taken several years ago: many Linux distributions (like Fedora) have been using this option for Perl, and OS X has enforced the same for many years.


Incompatible Changes

Subroutine signatures moved before attributes

The experimental sub signatures feature, as introduced in 5.20, parsed signatures after attributes. In this release, following feedback from users of the experimental feature, the positioning has been moved such that signatures occur after the subroutine name (if any) and before the attribute list (if any).

& and \& prototypes accept only subs

The & prototype character now accepts only anonymous subs (sub {...}), things beginning with \&, or an explicit undef. Formerly it erroneously also allowed references to arrays, hashes, and lists. [perl #4539]. [perl #123062].

In addition, the \& prototype was allowing subroutine calls, whereas now it only allows subroutines: &foo is still permitted as an argument, while &foo() and foo() no longer are. [perl #77860].

use encoding is now lexical

The encoding pragma's effect is now limited to lexical scope. This pragma is deprecated, but in the meantime, it could adversely affect unrelated modules that are included in the same program; this change fixes that.

List slices returning empty lists

List slices now return an empty list only if the original list was empty (or if there are no indices). Formerly, a list slice would return an empty list if all indices fell outside the original list; now it returns a list of undef values in that case. [perl #114498].
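The cases can be compared side by side. This sketch uses literal lists for brevity.

```perl
use strict;
use warnings;

# All indices fall outside a non-empty list: one undef per index.
my @out_of_range = (1, 2, 3)[5, 6];

# The original list is empty: the slice is empty too.
my @from_empty = ()[0, 1];

# Indices partly in range behave as before.
my @partly = ('a', 'b')[1, 2];

print "out of range: ", scalar(@out_of_range), " elements\n";
print "from empty:   ", scalar(@from_empty), " elements\n";
print "partly:       ", scalar(@partly), " elements\n";
```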

\N{} with a sequence of multiple spaces is now a fatal error

E.g. \N{TOOMANY SPACES} or \N{TRAILING SPACE }. This has been deprecated since v5.18.

use UNIVERSAL '...' is now a fatal error

Importing functions from UNIVERSAL has been deprecated since v5.12, and is now a fatal error. use UNIVERSAL without any arguments is still allowed.

In double-quotish \cX, X must now be a printable ASCII character

In prior releases, failure to do this raised a deprecation warning.

Splitting the tokens (? and (* in regular expressions is now a fatal compilation error.

These had been deprecated since v5.18.

qr/foo/x now ignores all Unicode pattern white space

The /x regular expression modifier allows the pattern to contain white space and comments (both of which are ignored) for improved readability. Until now, not all the white space characters that Unicode designates for this purpose were handled. The additional ones now recognized are:

    U+0085 NEXT LINE
    U+200E LEFT-TO-RIGHT MARK
    U+200F RIGHT-TO-LEFT MARK
    U+2028 LINE SEPARATOR
    U+2029 PARAGRAPH SEPARATOR

The use of these characters with /x outside bracketed character classes and when not preceded by a backslash has raised a deprecation warning since v5.18. Now they will be ignored.
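A small sketch of the new behavior, assuming a perl of 5.22 or later. The pattern is built in a variable so that the LINE SEPARATOR can be written as an escape.

```perl
use strict;
use warnings;

# The pattern text contains ordinary spaces and U+2028 LINE SEPARATOR.
my $pattern = "a \x{2028} b";

# Under /x all of that white space is ignored, so this matches "ab".
my $matched_with_x = ("ab" =~ /$pattern/x) ? 1 : 0;

# Without /x the white space is literal and "ab" does not match.
my $matched_without_x = ("ab" =~ /$pattern/) ? 1 : 0;

print "with /x: $matched_with_x, without /x: $matched_without_x\n";
```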

Comment lines within (?[ ]) are now ended only by a \n

(?[ ]) is an experimental feature, introduced in v5.18. It operates as if /x is always enabled. But there was a difference: comment lines (following a # character) were terminated by anything matching \R which includes all vertical whitespace, such as form feeds. For consistency, this is now changed to match what terminates comment lines outside (?[ ]), namely a \n (even if escaped), which is the same as what terminates a heredoc string and formats.

(?[...]) operators now follow standard Perl precedence

This experimental feature allows set operations in regular expression patterns. Prior to this, the intersection operator had the same precedence as the other binary operators. Now it has higher precedence. This could lead to different outcomes than existing code expects (though the documentation has always noted that this change might happen, recommending fully parenthesizing the expressions). See Extended Bracketed Character Classes in the perlrecharclass manpage.

Omitting % and @ on hash and array names is no longer permitted

Really old Perl let you omit the @ on array names and the % on hash names in some spots. This has issued a deprecation warning since Perl 5.000, and is no longer permitted.

"$!" text is now in English outside the scope of use locale

Previously, the text, unlike almost everything else, always came out based on the current underlying locale of the program. (Also affected on some systems is "$^E".) For programs that are unprepared to handle locale differences, this can cause garbage text to be displayed. It's better to display text that is translatable via some tool than garbage text which is much harder to figure out.

"$!" text will be returned in UTF-8 when appropriate

The stringification of $! and $^E will have the UTF-8 flag set when the text is actually non-ASCII UTF-8. This will enable programs that are set up to be locale-aware to properly output messages in the user's native language. Code that needs to continue the 5.20 and earlier behavior can do the stringification within the scopes of both use bytes and use locale ":messages". Within these two scopes, no other Perl operations will be affected by locale; only $! and $^E stringification. The bytes pragma causes the UTF-8 flag to not be set, just as in previous Perl releases. This resolves [perl #112208].

Support for ?PATTERN? without explicit operator has been removed

The m?PATTERN? construct, which allows matching a regex only once, previously had an alternative form that was written directly with a question mark delimiter, omitting the explicit m operator. This usage has produced a deprecation warning since 5.14.0. It is now a syntax error, so that the question mark can be available for use in new operators.

defined(@array) and defined(%hash) are now fatal errors

These have been deprecated since v5.6.1 and have raised deprecation warnings since v5.16.
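Because the error is raised at compile time, this sketch compiles the old form inside a string eval to show the failure, then shows the usual replacement of testing the aggregate directly:

```perl
use strict;
use warnings;

# The old form no longer compiles; the string eval fails and sets $@.
my $result = eval q{ my @array; defined(@array) ? 1 : 0 };
my $compile_error = $@;

# The replacement: an array in boolean context is true when non-empty.
my @array = ();
my $is_empty = @array ? 0 : 1;

print "old form failed with: $compile_error";
print "array is ", ($is_empty ? "empty" : "not empty"), "\n";
```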

Using a hash or an array as a reference are now fatal errors

For example, %foo->{"bar"} now causes a fatal compilation error. These have been deprecated since before v5.8, and have raised deprecation warnings since then.

Changes to the * prototype

The * character in a subroutine's prototype used to allow barewords to take precedence over most, but not all, subroutine names. It was never consistent and exhibited buggy behavior.

Now it has been changed, so subroutines always take precedence over barewords, which brings it into conformity with similarly prototyped built-in functions:

    sub splat(*) { ... }
    sub foo { ... }
    splat(foo); # now always splat(foo())
    splat(bar); # still splat('bar') as before
    close(foo); # close(foo())
    close(bar); # close('bar')


Deprecations

Setting ${^ENCODING} to anything but undef

This variable allows Perl scripts to be written in an encoding other than ASCII or UTF-8. However, it affects all modules globally, leading to wrong answers and segmentation faults. New scripts should be written in UTF-8; old scripts should be converted to UTF-8, which is easily done with the piconv utility.

Use of non-graphic characters in single-character variable names

The syntax for single-character variable names is more lenient than for longer variable names, allowing the one-character name to be a punctuation character or even invisible (a non-graphic). Perl v5.20 deprecated the ASCII-range controls as such a name. Now, all non-graphic characters that formerly were allowed are deprecated. The practical effect of this occurs only when not under use utf8, and affects just the C1 controls (code points 0x80 through 0x9F), NO-BREAK SPACE, and SOFT HYPHEN.

Inlining of sub () { $var } with observable side-effects

In many cases Perl makes sub () { $var } into an inlinable constant subroutine, capturing the value of $var at the time the sub expression is evaluated. This can break the closure behavior in those cases where $var is subsequently modified, since the subroutine won't return the changed value. (Note that this all only applies to anonymous subroutines with an empty prototype (sub ()).)

This usage is now deprecated in those cases where the variable could be modified elsewhere. Perl detects those cases and emits a deprecation warning. Such code will likely change in the future and stop producing a constant.

If your variable is only modified in the place where it is declared, then Perl will continue to make the sub inlinable with no warnings.

    sub make_constant {
        my $var = shift;
        return sub () { $var }; # fine
    }
    sub make_constant_deprecated {
        my $var;
        $var = shift;
        return sub () { $var }; # deprecated
    }
    sub make_constant_deprecated2 {
        my $var = shift;
        log_that_value($var); # could modify $var
        return sub () { $var }; # deprecated
    }

In the second example above, determining that $var is assigned to only once is too hard. That it happens in a spot other than the my declaration is enough for Perl to find it suspicious.

This deprecation warning happens only for a simple variable for the body of the sub. (A BEGIN block or use statement inside the sub is ignored, because it does not become part of the sub's body.) For more complex cases, such as sub () { do_something() if 0; $var } the behavior has changed such that inlining does not happen if the variable is modifiable elsewhere. Such cases should be rare.

Use of multiple /x regexp modifiers

It is now deprecated to say something like any of the following:

    qr/foo/xx;
    /(?xax:foo)/;
    use re qw(/amxx);

That is, now x should only occur once in any string of contiguous regular expression pattern modifiers. We do not believe there are any occurrences of this in all of CPAN. This is in preparation for a future Perl release having /xx permit white-space for readability in bracketed character classes (those enclosed in square brackets: [...]).

Using a NO-BREAK space in a character alias for \N{...} is now deprecated

This non-graphic character is essentially indistinguishable from a regular space, and so should not be allowed. See CUSTOM ALIASES in the charnames manpage.

A literal "{" should now be escaped in a pattern

If you want a literal left curly bracket (also called a left brace) in a regular expression pattern, you should now escape it by either preceding it with a backslash ("\{") or enclosing it within square brackets "[{]", or by using \Q; otherwise a deprecation warning will be raised. This was first announced as forthcoming in the v5.16 release; it will allow future extensions to the language to happen.
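The three spellings can be checked against a sample string (invented for the example):

```perl
use strict;
use warnings;

my $text = 'a{b';

# 1. Precede the brace with a backslash.
my $with_backslash = ($text =~ /a\{b/) ? 1 : 0;

# 2. Enclose it in a bracketed character class.
my $with_class = ($text =~ /a[{]b/) ? 1 : 0;

# 3. Let \Q quote an interpolated literal.
my $literal = 'a{b';
my $with_quotemeta = ($text =~ /\Q$literal\E/) ? 1 : 0;

print "backslash: $with_backslash, class: $with_class, \\Q: $with_quotemeta\n";
```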

Making all warnings fatal is discouraged

The documentation for fatal warnings notes that use warnings FATAL => 'all' is discouraged, and provides stronger language about the risks of fatal warnings in general.


Performance Enhancements


Modules and Pragmata

Updated Modules and Pragmata

Many of the libraries distributed with perl have been upgraded since v5.20.0. For a complete list of changes, run:

  corelist --diff 5.20.0 5.22.0

You can substitute your favorite version in place of 5.20.0, too.

Some notable changes include:

Removed Modules and Pragmata

The following modules (and associated modules) have been removed from the core perl distribution:


Documentation

New Documentation

the perlunicook manpage

This document, by Tom Christiansen, provides examples of handling Unicode in Perl.

Changes to Existing Documentation

the perlaix manpage

the perlapi manpage

the perldata manpage

the perlebcdic manpage

the perlfilter manpage

the perlfunc manpage

the perlguts manpage

the perlhack manpage

the perlhacktips manpage

the perlhpux manpage

the perllocale manpage

the perlmodstyle manpage

the perlop manpage

the perlpodspec manpage

the perlpolicy manpage

the perlport manpage

the perlre manpage

the perlrebackslash manpage

the perlrecharclass manpage

the perlref manpage

the perlsec manpage

the perlsyn manpage

the perlunicode manpage

the perluniintro manpage

the perlvar manpage

the perlvms manpage

the perlxs manpage


Diagnostics

The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see the perldiag manpage.

New Diagnostics

New Errors

New Warnings

Changes to Existing Diagnostics

Diagnostic Removals


Utility Changes

find2perl, s2p and a2p removal

the h2ph manpage

the encguess manpage


Configuration and Compilation


Testing


Platform Support

Regained Platforms

IRIX and Tru64 platforms are working again.
Some make test failures remain: [perl #123977] and [perl #125298] for IRIX; [perl #124212], [cpan #99605], and [cpan #104836] for Tru64.

z/OS running EBCDIC Code Page 1047
Core perl now works on this EBCDIC platform. Earlier perls also worked, but, even though support wasn't officially withdrawn, recent perls would not compile and run well. Perl 5.20 would work, but had many bugs which have now been fixed. Many CPAN modules that ship with Perl still fail tests, including Pod::Simple. However the version of Pod::Simple currently on CPAN should work; it was fixed too late to include in Perl 5.22. Work is under way to fix many of the still-broken CPAN modules, which likely will be installed on CPAN when completed, so that you may not have to wait until Perl 5.24 to get a working version.

Discontinued Platforms

NeXTSTEP/OPENSTEP
NeXTSTEP was a proprietary operating system bundled with NeXT's workstations in the early to mid 90s; OPENSTEP was an API specification that provided a NeXTSTEP-like environment on a non-NeXTSTEP system. Both are now long dead, so support for building Perl on them has been removed.

Platform-Specific Notes

EBCDIC
Special handling is required of the perl interpreter on EBCDIC platforms to get qr/[i-j]/ to match only "i" and "j", since there are 7 characters between the code points for "i" and "j". This special handling had only been invoked when both ends of the range are literals. Now it is also invoked if any of the \N{...} forms for specifying a character by name or Unicode code point is used instead of a literal. See Character Ranges in the perlrecharclass manpage.

HP-UX
The archname now distinguishes use64bitint from use64bitall.

Android
Build support has been improved for cross-compiling in general and for Android in particular.

VMS
Win32
OpenBSD
On OpenBSD, Perl will now default to using the system malloc due to the security features it provides. Perl's own malloc wrapper has been in use since v5.14 due to performance reasons, but the OpenBSD project believes the tradeoff is worth it and would prefer that users who need the speed specifically ask for it.

[perl #122000].

Solaris


Internal Changes


Selected Bug Fixes


Known Problems


Obituary

Brian McCauley died on May 8, 2015. He was a frequent poster to Usenet, Perl Monks, and other Perl forums, and made several CPAN contributions under the nick NOBULL, including to the Perl FAQ. He attended almost every YAPC::Europe, and indeed, helped organise YAPC::Europe 2006 and the QA Hackathon 2009. His wit and his delight in intricate systems were particularly apparent in his love of board games; many Perl mongers will have fond memories of playing Fluxx and other games with Brian. He will be missed.


Acknowledgements

Perl 5.22.0 represents approximately 12 months of development since Perl 5.20.0 and contains approximately 590,000 lines of changes across 2,400 files from 94 authors.

Excluding auto-generated files, documentation and release tools, there were approximately 370,000 lines of changes to 1,500 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.22.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Alberto Simões, Alex Solovey, Alex Vandiver, Alexandr Ciornii, Alexandre (Midnite) Jousset, Andreas König, Andreas Voegele, Andrew Fresh, Andy Dougherty, Anthony Heading, Aristotle Pagaltzis, brian d foy, Brian Fraser, Chad Granum, Chris 'BinGOs' Williams, Craig A. Berry, Dagfinn Ilmari Mannsåker, Daniel Dragan, Darin McBride, Dave Rolsky, David Golden, David Mitchell, David Wheeler, Dmitri Tikhonov, Doug Bell, E. Choroba, Ed J, Eric Herman, Father Chrysostomos, George Greer, Glenn D. Golden, Graham Knop, H.Merijn Brand, Herbert Breunung, Hugo van der Sanden, James E Keenan, James McCoy, James Raspass, Jan Dubois, Jarkko Hietaniemi, Jasmine Ngan, Jerry D. Hedden, Jim Cromie, John Goodyear, kafka, Karen Etheridge, Karl Williamson, Kent Fredric, kmx, Lajos Veres, Leon Timmermans, Lukas Mai, Mathieu Arnold, Matthew Horsfall, Max Maischein, Michael Bunk, Nicholas Clark, Niels Thykier, Niko Tyni, Norman Koch, Olivier Mengué, Peter John Acklam, Peter Martini, Petr Písař, Philippe Bruhat (BooK), Pierre Bogossian, Rafael Garcia-Suarez, Randy Stauner, Reini Urban, Ricardo Signes, Rob Hoelz, Rostislav Skudnov, Sawyer X, Shirakata Kentaro, Shlomi Fish, Sisyphus, Slaven Rezic, Smylers, Steffen Müller, Steve Hay, Sullivan Beck, syber, Tadeusz Sośnierz, Thomas Sibley, Todd Rinaldo, Tony Cook, Vincent Pit, Vladimir Marek, Yaroslav Kuzmin, Yves Orton, Ævar Arnfjörð Bjarmason.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.


Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at https://rt.perl.org/. There may also be information at http://www.perl.org/, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.


SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.
