author | Tom Ryder <tom@sanctum.geek.nz> | 2017-10-06 21:21:29 +1300 |
---|---|---|
committer | Tom Ryder <tom@sanctum.geek.nz> | 2017-10-06 21:21:29 +1300 |
commit | 17beaed6b54ddc6288347e91d42dd2c390282e0c (patch) | |
tree | 4c54362c902d5b8a7910e3cd3f9bd95494a2b368 | |
parent | Define PREREQ_PM (diff) | |
Refine documentation a little
I'm mostly just playing now, seeing how nice I can get this looking with
just plain POD.
-rw-r--r-- | README.markdown | 389 |
-rw-r--r-- | lib/List/Breakdown.pm | 128 |
2 files changed, 401 insertions, 116 deletions
diff --git a/README.markdown b/README.markdown index b8b83c9..5473dec 100644 --- a/README.markdown +++ b/README.markdown @@ -1,66 +1,349 @@ -List::Breakdown -=============== +# NAME -Filter elements from a list non-uniquely into a specified hash -structure, which can be nested, that pass subroutines, match regular -expressions, or fall within intervals. +List::Breakdown - Build sublist structures matching conditions +# VERSION -Installation ------------- +Version 0.17 -To install this module, run the following commands: +# SYNOPSIS - perl Makefile.PL - make - make test - make install + use List::Breakdown 'breakdown'; + ... + my @words = qw(foo bar baz quux wibble florb); + my $cats = { + all => sub { 1 }, + has_b => sub { m/ b /msx }, + has_w => sub { m/ w /msx }, + length => { + 3 => sub { length == 3 }, + 4 => sub { length == 4 }, + long => sub { length > 4 }, + }, + has_ba => qr/ba/msx, + }; + my %filtered = breakdown $cats, @words; -Support and Documentation -------------------------- +This puts the following structure in `%filtered`: -After installing, you can find documentation for this module with the -perldoc command. + ( + all => ['foo', 'bar', 'baz', 'quux', 'wibble', 'florb'], + has_b => ['bar', 'baz', 'wibble', 'florb'], + has_w => ['wibble'], + length => { + 3 => ['foo', 'bar', 'baz'], + 4 => ['quux'], + long => ['wibble', 'florb'], + }, + has_ba => ['bar', 'baz'], + ) + +# DESCRIPTION + +This module assists you in making a _breakdown_ of a list, copying and +filtering its items into a structured bucket layout according to your +specifications. Think of it as a syntax for [`grep`](https://metacpan.org/pod/perlfunc#grep-BLOCK-LIST) that returns named and structured results from one list. 
+ +It differs from the excellent [List::Categorize](https://metacpan.org/pod/List::Categorize) in the use +of references to define each category, and in not requiring only one final +category for any given item; an item can end up in the result set for more than +one filter. + +If you want to divide or _partition_ your list so that each item can only +appear in one category, you may want either +[List::MoreUtils](https://metacpan.org/pod/List::MoreUtils#Partitioning) or possibly +[Set::Partition](https://metacpan.org/pod/Set::Partition) instead. + +# SUBROUTINES/METHODS + +## `breakdown(\%spec, @items)` + +This is the only exportable subroutine. Given a hash reference structure and a +list of items, it applies each of the referenced values as tests, returning a +new hash in the same structure with the references replaced with the matching +items, in the same way as [`grep`](https://metacpan.org/pod/perlfunc#grep-BLOCK-LIST). + +There are two shortcut syntaxes for a value in the `\%spec` structure: + +- `ARRAY` + + If the referenced array has exactly two items, it will be interpreted as + defining numeric bounds `[lower,upper)` for its values. `undef` can be used + to denote negative or positive infinity. Any other number of items is a fatal + error. + +- `Regexp` + + This will be interpreted as a pattern for the list items to match. + +Additionally, if the value is a `HASH` reference, it can be used to make a +sub-part of the structure, as demonstrated in the `length` key of the example +`\%spec` given in [SYNOPSIS](#synopsis). 
+ +# EXAMPLES + +## Collecting troublesome records + +Suppose you have a list of strings from a very legacy system that you need to +regularly check for problematic characters, alerting you to problems with an +imperfect Perl parser: + + my @records = ( + "NEW CUSTOMER John O''Connor\r 2017-01-01", + "RETURNING CUSTOMER\tXah Zhang 2016-01-01", + "CHECK ACCOUNT Pierre d'Alun 2016-12-01", + "RETURNING CUSTOMER Aaron Carter 2016-05-01" + ); + +You could have a bucket structure like this, using the **pattern syntax**, which +catches certain error types you've seen before for review: + + my %buckets = ( + bad_whitespace => qr/ [\r\t] /msx, + apostrophes => qr/ ' /msx, + double_apostrophes => qr/ '' /msx, + not_ascii => qr/ [^[:ascii:]] /msx + }; + +Applying the bucket structure like so: + + my %results = breakdown \%buckets, @records; + +The result set would look like this: + + my %expected = ( + bad_whitespace => [ + "NEW CUSTOMER John O''Connor\r 2017-01-01", + "RETURNING CUSTOMER\tXah Lee 2016-01-01" + ], + apostrophes => [ + "NEW CUSTOMER John O''Connor\r 2017-01-01", + 'CHECK ACCOUNT Pierre d\'Alun 2016-12-01' + ], + double_apostrophes => [ + "NEW CUSTOMER John O''Connor\r 2017-01-01" + ], + not_ascii => [ + ] + ); + +Notice that some of the lines appear in more than one list, and that the +`not_ascii` bucket is empty, because none of the items matched it. + +## Monitoring system check results + +Suppose you ran a list of checks with your monitoring system, and now you have +a list of `HASH` references with keys describing each check and its outcome: + + my @checks = ( + { + hostname => 'webserver1', + status => 'OK', + }, + { + hostname => 'webserver2', + status => 'CRITICAL', + }, + { + hostname => 'webserver3', + status => 'WARNING', + }, + { + hostname => 'webserver4', + status => 'OK', + } + ); + +You would like to break the list down by status. 
You would lay out your buckets +like so, using the **subroutine syntax**: + + my %buckets = ( + ok => sub { $_->{status} eq 'OK' }, + problem => { + warning => sub { $_->{status} eq 'WARNING' }, + critical => sub { $_->{status} eq 'CRITICAL' }, + unknown => sub { $_->{status} eq 'UNKNOWN' }, + }, + ); + +And apply them like so: + + my %results = breakdown \%buckets, @checks; + +For our sample data above, this would yield the following structure in +`%results`: + + ( + ok => [ + { + hostname => 'webserver1', + status => 'OK' + }, + { + hostname => 'webserver4', + status => 'OK' + } + ], + problem => { + warning => [ + { + hostname => 'webserver3', + status => 'WARNING' + } + ], + critical => [ + { + hostname => 'webserver2', + status => 'CRITICAL' + } + ], + unknown => [] + } + ) + +Note the extra level of `HASH` references beneath the `problem` key. + +## Grouping numbers by size + +Suppose you have a list of numbers from your volcanic activity reporting +system, some of which might be merely worrisome, and some others an emergency, +and they need to be filtered to know where to send them: + + my @numbers = ( 1, 32, 3718.4, 0x56, 0777, 3.14, -5, 1.2e5 ); + +You could filter them into buckets like this, using the **interval syntax**: an +`ARRAY` reference with exactly two elements: lower bound (inclusive) first, +upper bound (exclusive) second: + + my $filters = { + negative => [ undef, 0 ], + positive => { + small => [ 0, 10 ], + medium => [ 10, 100 ], + large => [ 100, undef ], + }, + }; + +Applying the bucket structure like so: + + my %filtered = breakdown $filters, @numbers; + +The result set would look like this: + + my %expected = ( + negative => [ -5 ] + positive => { + small => [ 1, 3.14 ], + medium => [ 32, 86 ], + large => [ 3_718.4, 511, 120_000 ] + }, + ); + +Notice that you can express infinity or negative infinity as `undef`. Note +also this is a numeric comparison only. 
+ +# AUTHOR + +Tom Ryder `<tom@sanctum.geek.nz>` + +# DIAGNOSTICS + +- `HASH reference expected for first argument` + + The first argument that `breakdown()` saw wasn't the hash reference it expects. + That's the only format a spec is allowed to have. + +- `Reference expected for '%s'` + + The value for the named key in the spec was not a reference, and one was + expected. + +- `Unhandled ref type %s for '%s'` + + The value for the named key in the spec is of a type that makes no sense to + this module. Legal reference types are `ARRAY`, `CODE`, `HASH`, and + `Regexp`. + +# DEPENDENCIES + +- Perl 5.6.0 or newer +- [base](https://metacpan.org/pod/base) +- [Carp](https://metacpan.org/pod/Carp) +- [Exporter](https://metacpan.org/pod/Exporter) + +# CONFIGURATION AND ENVIRONMENT + +None required. + +# INCOMPATIBILITIES + +None known. + +# BUGS AND LIMITATIONS + +Definitely. This is a very early release. Please report any bugs or feature +requests to `tom@sanctum.geek.nz`. + +# SUPPORT + +You can find documentation for this module with the **perldoc** command. perldoc List::Breakdown -License and Copyright ---------------------- +You can also look for information at: + +- RT: CPAN's request tracker (report bugs here) + + [http://rt.cpan.org/NoAuth/Bugs.html?Dist=List-Breakdown](http://rt.cpan.org/NoAuth/Bugs.html?Dist=List-Breakdown) + +- AnnoCPAN: Annotated CPAN documentation + + [http://annocpan.org/dist/List-Breakdown](http://annocpan.org/dist/List-Breakdown) + +- CPAN Ratings + + [http://cpanratings.perl.org/d/List-Breakdown](http://cpanratings.perl.org/d/List-Breakdown) + +- Search CPAN + + [http://search.cpan.org/dist/List-Breakdown/](http://search.cpan.org/dist/List-Breakdown/) + +# LICENSE AND COPYRIGHT Copyright (C) 2017 Tom Ryder -This program is free software; you can redistribute it and/or modify it -under the terms of the the Artistic License (2.0). 
You may obtain a -copy of the full license at: - -<http://www.perlfoundation.org/artistic_license_2_0> - -Any use, modification, and distribution of the Standard or Modified -Versions is governed by this Artistic License. By using, modifying or -distributing the Package, you accept this license. Do not use, modify, -or distribute the Package, if you do not accept this license. - -If your Modified Version has been derived from a Modified Version made -by someone other than you, you are nevertheless required to ensure that -your Modified Version complies with the requirements of this license. - -This license does not grant you the right to use any trademark, service -mark, tradename, or logo of the Copyright Holder. - -This license includes the non-exclusive, worldwide, free-of-charge -patent license to make, have made, use, offer to sell, sell, import and -otherwise transfer the Package with respect to any patent claims -licensable by the Copyright Holder that are necessarily infringed by the -Package. If you institute patent litigation (including a cross-claim or -counterclaim) against any party alleging that the Package constitutes -direct or contributory patent infringement, then this Artistic License -to you shall terminate on the date that such litigation is filed. - -Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER -AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. -THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR -PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY -YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR -CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR -CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, -EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +This program is free software; you can redistribute it and/or modify it under +the terms of the Artistic License (2.0). 
You may obtain a copy of the full +license at: + +[http://www.perlfoundation.org/artistic\_license\_2\_0](http://www.perlfoundation.org/artistic_license_2_0) + +Any use, modification, and distribution of the Standard or Modified Versions is +governed by this Artistic License. By using, modifying or distributing the +Package, you accept this license. Do not use, modify, or distribute the +Package, if you do not accept this license. + +If your Modified Version has been derived from a Modified Version made by +someone other than you, you are nevertheless required to ensure that your +Modified Version complies with the requirements of this license. + +This license does not grant you the right to use any trademark, service mark, +tradename, or logo of the Copyright Holder. + +This license includes the non-exclusive, worldwide, free-of-charge patent +license to make, have made, use, offer to sell, sell, import and otherwise +transfer the Package with respect to any patent claims licensable by the +Copyright Holder that are necessarily infringed by the Package. If you +institute patent litigation (including a cross-claim or counterclaim) against +any party alleging that the Package constitutes direct or contributory patent +infringement, then this Artistic License to you shall terminate on the date +that such litigation is filed. + +Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND +CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED +WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR +NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. +UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY +OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. 
diff --git a/lib/List/Breakdown.pm b/lib/List/Breakdown.pm index 291a329..653af62 100644 --- a/lib/List/Breakdown.pm +++ b/lib/List/Breakdown.pm @@ -96,7 +96,7 @@ __END__ =for stopwords sublists Unhandled tradename licensable MERCHANTABILITY hashrefs CPAN AnnoCPAN -syntaxes +syntaxes perldoc =head1 NAME @@ -106,19 +106,6 @@ List::Breakdown - Build sublist structures matching conditions Version 0.17 -=head1 DESCRIPTION - -This module "breaks down" a list--filtering elements from a list into a -specified bucket layout. It may be useful in situations where you have a big -list of things to generate reports on, or to otherwise filter into several -sublists. - -It differs from the excellent C<List::Categorize> in the use of subroutine -references for each category and in not requiring only one final category for -any given item; an item can end up in the result set for more than one filter. - -You could maybe think of this as a multi-C<grep> that returns named results. - =head1 SYNOPSIS use List::Breakdown 'breakdown'; @@ -151,33 +138,53 @@ This puts the following structure in C<%filtered>: has_ba => ['bar', 'baz'], ) +=head1 DESCRIPTION + +This module assists you in making a I<breakdown> of a list, copying and +filtering its items into a structured bucket layout according to your +specifications. Think of it as a syntax for L<C<grep>|perlfunc/"grep BLOCK +LIST"> that returns named and structured results from one list. + +It differs from the excellent L<List::Categorize|List::Categorize> in the use +of references to define each category, and in not requiring only one final +category for any given item; an item can end up in the result set for more than +one filter. + +If you want to divide or I<partition> your list so that each item can only +appear in one category, you may want either +L<List::MoreUtils|List::MoreUtils/"Partitioning"> or possibly +L<Set::Partition|Set::Partition> instead. 
+ =head1 SUBROUTINES/METHODS -=head2 B<breakdown(\%spec, @items)> +=head2 C<breakdown(\%spec, @items)> -Exportable subroutine; given a hash reference structure and a list of items, -apply each of the subroutines or regular expressions given as values of the -hash reference, returning a new hash in the same structure with the tests -replaced with the items for which the subroutine returns true, in the same way -as C<grep>. +This is the only exportable subroutine. Given a hash reference structure and a +list of items, it applies each of the referenced values as tests, returning a +new hash in the same structure with the references replaced with the matching +items, in the same way as L<C<grep>|perlfunc/"grep BLOCK LIST">. -There are two shortcut syntaxes: +There are two shortcut syntaxes for a value in the C<\%spec> structure: =over 4 -=item * +=item * C<ARRAY> -If a value in the C<spec> structure is an C<ARRAY> reference with two items, it -will be interpreted as defining bounds C<[lower,upper)> for matched values. -C<undef> can be used to denote negative or positive infinity. +If the referenced array has exactly two items, it will be interpreted as +defining numeric bounds C<[lower,upper)> for its values. C<undef> can be used +to denote negative or positive infinity. Any other number of items is a fatal +error. -=item * +=item * C<Regexp> -If it's a C<Regexp> reference, it will be interpreted as a pattern to match -against all of the items, and will return the items that match. +This will be interpreted as a pattern for the list items to match. =back +Additionally, if the value is a C<HASH> reference, it can be used to make a +sub-part of the structure, as demonstrated in the C<length> key of the example +C<\%spec> given in L<SYNOPSIS|/SYNOPSIS>. 
+ =head1 EXAMPLES =head2 Collecting troublesome records @@ -193,18 +200,15 @@ imperfect Perl parser: "RETURNING CUSTOMER Aaron Carter 2016-05-01" ); -You could have a bucket structure like this, which catches certain error types -you've seen before for review: +You could have a bucket structure like this, using the B<pattern syntax>, which +catches certain error types you've seen before for review: my %buckets = ( bad_whitespace => qr/ [\r\t] /msx, apostrophes => qr/ ' /msx, double_apostrophes => qr/ '' /msx, not_ascii => qr/ [^[:ascii:]] /msx - ); - -Notice that you don't have to wrap a quoted regular expression to match in a -C<sub> subroutine reference, as a convenience shortcut. + }; Applying the bucket structure like so: @@ -229,12 +233,12 @@ The result set would look like this: ); Notice that some of the lines appear in more than one list, and that the -C<not_ascii> bucket is empty because none of the items matched it. +C<not_ascii> bucket is empty, because none of the items matched it. =head2 Monitoring system check results -Suppose you ran a list of checks with your monitoring system, and you have a -list of hashrefs describing each check and its outcome: +Suppose you ran a list of checks with your monitoring system, and now you have +a list of C<HASH> references with keys describing each check and its outcome: my @checks = ( { @@ -255,8 +259,8 @@ list of hashrefs describing each check and its outcome: } ); -You would like to break the list down by status. Using C<List::Breakdown>, you -would lay out your buckets like so: +You would like to break the list down by status. You would lay out your buckets +like so, using the B<subroutine syntax>: my %buckets = ( ok => sub { $_->{status} eq 'OK' }, @@ -271,9 +275,6 @@ And apply them like so: my %results = breakdown \%buckets, @checks; -You can then apply C<%buckets> to any other list you may need to check in the -same way to get the same structure. 
- For our sample data above, this would yield the following structure in C<%results>: @@ -305,19 +306,19 @@ C<%results>: } ) -Note the extra level of hash referencing beneath the C<problem> key. +Note the extra level of C<HASH> references beneath the C<problem> key. =head2 Grouping numbers by size -Suppose you have a list of stray numbers from your volcanic activity reporting -system, some of which might be merely worrisome and some an emergency, and they -need to be filtered to know where to send them: +Suppose you have a list of numbers from your volcanic activity reporting +system, some of which might be merely worrisome, and some others an emergency, +and they need to be filtered to know where to send them: my @numbers = ( 1, 32, 3718.4, 0x56, 0777, 3.14, -5, 1.2e5 ); -You could filter them into buckets like this, using the interval syntax; an -array reference with exactly two elements; lower bound (inclusive) first, upper -bound (exclusive) second: +You could filter them into buckets like this, using the B<interval syntax>: an +C<ARRAY> reference with exactly two elements: lower bound (inclusive) first, +upper bound (exclusive) second: my $filters = { negative => [ undef, 0 ], @@ -354,20 +355,21 @@ Tom Ryder C<< <tom@sanctum.geek.nz> >> =over 4 -=item HASH reference expected for first argument +=item C<HASH reference expected for first argument> -The first argument that B<breakdown()> saw wasn't the hash reference it expects. +The first argument that C<breakdown()> saw wasn't the hash reference it expects. That's the only format a spec is allowed to have. -=item Reference expected for '%s' +=item C<Reference expected for '%s'> The value for the named key in the spec was not a reference, and one was expected. -=item Unhandled ref type %s for '%s' +=item C<Unhandled ref type %s for '%s'> The value for the named key in the spec is of a type that makes no sense to -this module. Legal reference types are C<HASH>, C<CODE>, and C<Regexp>. +this module. 
Legal reference types are C<ARRAY>, C<CODE>, C<HASH>, and +C<Regexp>. =back @@ -381,15 +383,15 @@ Perl 5.6.0 or newer =item * -C<base> +L<base|base> =item * -C<Carp> +L<Carp|Carp> =item * -C<Exporter> +L<Exporter|Exporter> =back @@ -408,7 +410,7 @@ requests to C<tom@sanctum.geek.nz>. =head1 SUPPORT -You can find documentation for this module with the C<perldoc> command. +You can find documentation for this module with the B<perldoc> command. perldoc List::Breakdown @@ -418,19 +420,19 @@ You can also look for information at: =item * RT: CPAN's request tracker (report bugs here) -<http://rt.cpan.org/NoAuth/Bugs.html?Dist=List-Breakdown> +L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=List-Breakdown> =item * AnnoCPAN: Annotated CPAN documentation -<http://annocpan.org/dist/List-Breakdown> +L<http://annocpan.org/dist/List-Breakdown> =item * CPAN Ratings -<http://cpanratings.perl.org/d/List-Breakdown> +L<http://cpanratings.perl.org/d/List-Breakdown> =item * Search CPAN -<http://search.cpan.org/dist/List-Breakdown/> +L<http://search.cpan.org/dist/List-Breakdown/> =back @@ -442,7 +444,7 @@ This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at: -<http://www.perlfoundation.org/artistic_license_2_0> +L<http://www.perlfoundation.org/artistic_license_2_0> Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the |
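
The interval syntax documented in this commit treats a two-element `ARRAY` reference as numeric bounds `[lower,upper)`, with `undef` standing for an unbounded side. The following is a minimal sketch of that matching rule in core Perl only. It is not the module's actual implementation, and the helper name `in_interval` is hypothetical, introduced here purely for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of List::Breakdown): builds a test
# subroutine for the half-open numeric interval [lower, upper).
# An undef bound means that side of the interval is unbounded.
sub in_interval {
    my ( $lower, $upper ) = @_;
    return sub {
        my $n = shift;
        return 0 if defined $lower and $n < $lower;     # below inclusive lower bound
        return 0 if defined $upper and $n >= $upper;    # at or above exclusive upper bound
        return 1;
    };
}

# Same sample data as the "Grouping numbers by size" example;
# 0x56 is 86 and 0777 is 511 once Perl parses the literals.
my @numbers = ( 1, 32, 3718.4, 0x56, 0777, 3.14, -5, 1.2e5 );

my $is_medium   = in_interval( 10,    100 );
my $is_negative = in_interval( undef, 0 );

my @medium   = grep { $is_medium->($_) } @numbers;
my @negative = grep { $is_negative->($_) } @numbers;

print "medium: @medium\n";
print "negative: @negative\n";
```

Because the comparison is purely numeric, the hexadecimal literal `0x56` lands in the `medium` bucket as 86, which matches the expected result set shown in the documentation.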