author Tom Ryder <tom@sanctum.geek.nz> 2017-10-06 21:21:29 +1300
committer Tom Ryder <tom@sanctum.geek.nz> 2017-10-06 21:21:29 +1300
commit 17beaed6b54ddc6288347e91d42dd2c390282e0c
tree 4c54362c902d5b8a7910e3cd3f9bd95494a2b368
parent d02210e4e0accb9d4ae8aa69d9fb980dbd956d2e
Refine documentation a little
I'm mostly just playing now, seeing how nice I can get this looking with just plain POD.
-rw-r--r-- README.markdown 389
-rw-r--r-- lib/List/Breakdown.pm 128
2 files changed, 401 insertions, 116 deletions
diff --git a/README.markdown b/README.markdown
index b8b83c9..5473dec 100644
--- a/README.markdown
+++ b/README.markdown
@@ -1,66 +1,349 @@
-List::Breakdown
-===============
+# NAME
-Filter elements from a list non-uniquely into a specified hash
-structure, which can be nested, that pass subroutines, match regular
-expressions, or fall within intervals.
+List::Breakdown - Build sublist structures matching conditions
+# VERSION
-Installation
-------------
+Version 0.17
-To install this module, run the following commands:
+# SYNOPSIS
- perl Makefile.PL
- make
- make test
- make install
+ use List::Breakdown 'breakdown';
+ ...
+ my @words = qw(foo bar baz quux wibble florb);
+ my $cats = {
+ all => sub { 1 },
+ has_b => sub { m/ b /msx },
+ has_w => sub { m/ w /msx },
+ length => {
+ 3 => sub { length == 3 },
+ 4 => sub { length == 4 },
+ long => sub { length > 4 },
+ },
+ has_ba => qr/ba/msx,
+ };
+ my %filtered = breakdown $cats, @words;
-Support and Documentation
--------------------------
+This puts the following structure in `%filtered`:
-After installing, you can find documentation for this module with the
-perldoc command.
+ (
+ all => ['foo', 'bar', 'baz', 'quux', 'wibble', 'florb'],
+ has_b => ['bar', 'baz', 'wibble', 'florb'],
+ has_w => ['wibble'],
+ length => {
+ 3 => ['foo', 'bar', 'baz'],
+ 4 => ['quux'],
+ long => ['wibble', 'florb'],
+ },
+ has_ba => ['bar', 'baz'],
+ )
+
+# DESCRIPTION
+
+This module assists you in making a _breakdown_ of a list, copying and
+filtering its items into a structured bucket layout according to your
+specifications. Think of it as a syntax for [`grep`](https://metacpan.org/pod/perlfunc#grep-BLOCK-LIST) that returns named and structured results from one list.
+
+It differs from the excellent [List::Categorize](https://metacpan.org/pod/List::Categorize) in the use
+of references to define each category, and in not requiring only one final
+category for any given item; an item can end up in the result set for more than
+one filter.
+
+If you want to divide or _partition_ your list so that each item can only
+appear in one category, you may want either
+[List::MoreUtils](https://metacpan.org/pod/List::MoreUtils#Partitioning) or possibly
+[Set::Partition](https://metacpan.org/pod/Set::Partition) instead.
+
+# SUBROUTINES/METHODS
+
+## `breakdown(\%spec, @items)`
+
+This is the only exportable subroutine. Given a hash reference structure and a
+list of items, it applies each of the referenced values as tests, returning a
+new hash in the same structure with the references replaced with the matching
+items, in the same way as [`grep`](https://metacpan.org/pod/perlfunc#grep-BLOCK-LIST).
+
+There are two shortcut syntaxes for a value in the `\%spec` structure:
+
+- `ARRAY`
+
+ If the referenced array has exactly two items, it will be interpreted as
+ defining numeric bounds `[lower,upper)` for its values. `undef` can be used
+ to denote negative or positive infinity. Any other number of items is a fatal
+ error.
+
+- `Regexp`
+
+ This will be interpreted as a pattern for the list items to match.
+
+Additionally, if the value is a `HASH` reference, it can be used to make a
+sub-part of the structure, as demonstrated in the `length` key of the example
+`\%spec` given in [SYNOPSIS](#synopsis).
+
+# EXAMPLES
+
+## Collecting troublesome records
+
+Suppose you have a list of strings from a very legacy system that you need to
+regularly check for problematic characters, alerting you to problems with an
+imperfect Perl parser:
+
+ my @records = (
+ "NEW CUSTOMER John O''Connor\r 2017-01-01",
+ "RETURNING CUSTOMER\tXah Zhang 2016-01-01",
+ "CHECK ACCOUNT Pierre d'Alun 2016-12-01",
+ "RETURNING CUSTOMER Aaron Carter 2016-05-01"
+ );
+
+You could have a bucket structure like this, using the **pattern syntax**, which
+catches certain error types you've seen before for review:
+
+ my %buckets = (
+ bad_whitespace => qr/ [\r\t] /msx,
+ apostrophes => qr/ ' /msx,
+ double_apostrophes => qr/ '' /msx,
+ not_ascii => qr/ [^[:ascii:]] /msx
+ );
+
+Applying the bucket structure like so:
+
+ my %results = breakdown \%buckets, @records;
+
+The result set would look like this:
+
+ my %expected = (
+ bad_whitespace => [
+ "NEW CUSTOMER John O''Connor\r 2017-01-01",
+ "RETURNING CUSTOMER\tXah Zhang 2016-01-01"
+ ],
+ apostrophes => [
+ "NEW CUSTOMER John O''Connor\r 2017-01-01",
+ 'CHECK ACCOUNT Pierre d\'Alun 2016-12-01'
+ ],
+ double_apostrophes => [
+ "NEW CUSTOMER John O''Connor\r 2017-01-01"
+ ],
+ not_ascii => [
+ ]
+ );
+
+Notice that some of the lines appear in more than one list, and that the
+`not_ascii` bucket is empty, because none of the items matched it.
+
+## Monitoring system check results
+
+Suppose you ran a list of checks with your monitoring system, and now you have
+a list of `HASH` references with keys describing each check and its outcome:
+
+ my @checks = (
+ {
+ hostname => 'webserver1',
+ status => 'OK',
+ },
+ {
+ hostname => 'webserver2',
+ status => 'CRITICAL',
+ },
+ {
+ hostname => 'webserver3',
+ status => 'WARNING',
+ },
+ {
+ hostname => 'webserver4',
+ status => 'OK',
+ }
+ );
+
+You would like to break the list down by status. You would lay out your buckets
+like so, using the **subroutine syntax**:
+
+ my %buckets = (
+ ok => sub { $_->{status} eq 'OK' },
+ problem => {
+ warning => sub { $_->{status} eq 'WARNING' },
+ critical => sub { $_->{status} eq 'CRITICAL' },
+ unknown => sub { $_->{status} eq 'UNKNOWN' },
+ },
+ );
+
+And apply them like so:
+
+ my %results = breakdown \%buckets, @checks;
+
+For our sample data above, this would yield the following structure in
+`%results`:
+
+ (
+ ok => [
+ {
+ hostname => 'webserver1',
+ status => 'OK'
+ },
+ {
+ hostname => 'webserver4',
+ status => 'OK'
+ }
+ ],
+ problem => {
+ warning => [
+ {
+ hostname => 'webserver3',
+ status => 'WARNING'
+ }
+ ],
+ critical => [
+ {
+ hostname => 'webserver2',
+ status => 'CRITICAL'
+ }
+ ],
+ unknown => []
+ }
+ )
+
+Note the extra level of `HASH` references beneath the `problem` key.
+
+## Grouping numbers by size
+
+Suppose you have a list of numbers from your volcanic activity reporting
+system, some of which might be merely worrisome, and some others an emergency,
+and they need to be filtered to know where to send them:
+
+ my @numbers = ( 1, 32, 3718.4, 0x56, 0777, 3.14, -5, 1.2e5 );
+
+You could filter them into buckets like this, using the **interval syntax**: an
+`ARRAY` reference with exactly two elements: lower bound (inclusive) first,
+upper bound (exclusive) second:
+
+ my $filters = {
+ negative => [ undef, 0 ],
+ positive => {
+ small => [ 0, 10 ],
+ medium => [ 10, 100 ],
+ large => [ 100, undef ],
+ },
+ };
+
+Applying the bucket structure like so:
+
+ my %filtered = breakdown $filters, @numbers;
+
+The result set would look like this:
+
+ my %expected = (
+ negative => [ -5 ],
+ positive => {
+ small => [ 1, 3.14 ],
+ medium => [ 32, 86 ],
+ large => [ 3_718.4, 511, 120_000 ]
+ },
+ );
+
+Notice that you can express infinity or negative infinity as `undef`. Note
+also this is a numeric comparison only.
+
+# AUTHOR
+
+Tom Ryder `<tom@sanctum.geek.nz>`
+
+# DIAGNOSTICS
+
+- `HASH reference expected for first argument`
+
+ The first argument that `breakdown()` saw wasn't the hash reference it expects.
+ That's the only format a spec is allowed to have.
+
+- `Reference expected for '%s'`
+
+ The value for the named key in the spec was not a reference, and one was
+ expected.
+
+- `Unhandled ref type %s for '%s'`
+
+ The value for the named key in the spec is of a type that makes no sense to
+ this module. Legal reference types are `ARRAY`, `CODE`, `HASH`, and
+ `Regexp`.
+
+# DEPENDENCIES
+
+- Perl 5.6.0 or newer
+- [base](https://metacpan.org/pod/base)
+- [Carp](https://metacpan.org/pod/Carp)
+- [Exporter](https://metacpan.org/pod/Exporter)
+
+# CONFIGURATION AND ENVIRONMENT
+
+None required.
+
+# INCOMPATIBILITIES
+
+None known.
+
+# BUGS AND LIMITATIONS
+
+Definitely. This is a very early release. Please report any bugs or feature
+requests to `tom@sanctum.geek.nz`.
+
+# SUPPORT
+
+You can find documentation for this module with the **perldoc** command.
perldoc List::Breakdown
-License and Copyright
----------------------
+You can also look for information at:
+
+- RT: CPAN's request tracker (report bugs here)
+
+ [http://rt.cpan.org/NoAuth/Bugs.html?Dist=List-Breakdown](http://rt.cpan.org/NoAuth/Bugs.html?Dist=List-Breakdown)
+
+- AnnoCPAN: Annotated CPAN documentation
+
+ [http://annocpan.org/dist/List-Breakdown](http://annocpan.org/dist/List-Breakdown)
+
+- CPAN Ratings
+
+ [http://cpanratings.perl.org/d/List-Breakdown](http://cpanratings.perl.org/d/List-Breakdown)
+
+- Search CPAN
+
+ [http://search.cpan.org/dist/List-Breakdown/](http://search.cpan.org/dist/List-Breakdown/)
+
+# LICENSE AND COPYRIGHT
Copyright (C) 2017 Tom Ryder
-This program is free software; you can redistribute it and/or modify it
-under the terms of the the Artistic License (2.0). You may obtain a
-copy of the full license at:
-
-<http://www.perlfoundation.org/artistic_license_2_0>
-
-Any use, modification, and distribution of the Standard or Modified
-Versions is governed by this Artistic License. By using, modifying or
-distributing the Package, you accept this license. Do not use, modify,
-or distribute the Package, if you do not accept this license.
-
-If your Modified Version has been derived from a Modified Version made
-by someone other than you, you are nevertheless required to ensure that
-your Modified Version complies with the requirements of this license.
-
-This license does not grant you the right to use any trademark, service
-mark, tradename, or logo of the Copyright Holder.
-
-This license includes the non-exclusive, worldwide, free-of-charge
-patent license to make, have made, use, offer to sell, sell, import and
-otherwise transfer the Package with respect to any patent claims
-licensable by the Copyright Holder that are necessarily infringed by the
-Package. If you institute patent litigation (including a cross-claim or
-counterclaim) against any party alleging that the Package constitutes
-direct or contributory patent infringement, then this Artistic License
-to you shall terminate on the date that such litigation is filed.
-
-Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER
-AND CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES.
-THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
-PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY
-YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR
-CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR
-CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE,
-EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+This program is free software; you can redistribute it and/or modify it under
+the terms of the Artistic License (2.0). You may obtain a copy of the full
+license at:
+
+[http://www.perlfoundation.org/artistic\_license\_2\_0](http://www.perlfoundation.org/artistic_license_2_0)
+
+Any use, modification, and distribution of the Standard or Modified Versions is
+governed by this Artistic License. By using, modifying or distributing the
+Package, you accept this license. Do not use, modify, or distribute the
+Package, if you do not accept this license.
+
+If your Modified Version has been derived from a Modified Version made by
+someone other than you, you are nevertheless required to ensure that your
+Modified Version complies with the requirements of this license.
+
+This license does not grant you the right to use any trademark, service mark,
+tradename, or logo of the Copyright Holder.
+
+This license includes the non-exclusive, worldwide, free-of-charge patent
+license to make, have made, use, offer to sell, sell, import and otherwise
+transfer the Package with respect to any patent claims licensable by the
+Copyright Holder that are necessarily infringed by the Package. If you
+institute patent litigation (including a cross-claim or counterclaim) against
+any party alleging that the Package constitutes direct or contributory patent
+infringement, then this Artistic License to you shall terminate on the date
+that such litigation is filed.
+
+Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND
+CONTRIBUTORS "AS IS' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED
+WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR
+NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW.
+UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR
+ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY
+OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
+DAMAGE.
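
The README text added above describes how each kind of reference in the spec acts as a test. As a rough illustration only, the following self-contained sketch mimics that documented behaviour in plain Perl. It is not the module's actual source: `sketch_breakdown` and `sketch_test` are hypothetical names, and the message for a wrong-sized `ARRAY` is invented here, since the documentation only says that case is a fatal error.

```perl
use strict;
use warnings;
use Carp 'croak';

# Hypothetical helper (not part of List::Breakdown): turn one spec value
# into a test subroutine that inspects $_, as grep does.
sub sketch_test {
    my ( $key, $value ) = @_;
    my $type = ref $value
      or croak "Reference expected for '$key'";

    # Subroutine syntax: use the code reference as the test directly
    return $value if $type eq 'CODE';

    # Pattern syntax: the item must match the quoted regular expression
    return sub { $_ =~ $value } if $type eq 'Regexp';

    # Interval syntax: [lower, upper), with undef meaning "unbounded"
    if ( $type eq 'ARRAY' ) {
        @{$value} == 2
          or croak "Two bounds expected for '$key'";    # invented message
        my ( $lower, $upper ) = @{$value};
        return sub {
            ( !defined($lower) || $_ >= $lower )
              && ( !defined($upper) || $_ < $upper );
        };
    }
    croak "Unhandled ref type $type for '$key'";
}

# Hypothetical stand-in for breakdown(): same call shape as documented
sub sketch_breakdown {
    my ( $spec, @items ) = @_;
    ref $spec eq 'HASH'
      or croak 'HASH reference expected for first argument';
    my %results;
    for my $key ( keys %{$spec} ) {
        my $value = $spec->{$key};

        # A nested HASH reference becomes a nested result structure
        if ( ref $value eq 'HASH' ) {
            $results{$key} = { sketch_breakdown( $value, @items ) };
            next;
        }
        my $test = sketch_test( $key, $value );
        $results{$key} = [ grep { $test->() } @items ];
    }
    return %results;
}

# Usage, with the numbers from the "Grouping numbers by size" example
my @numbers = ( 1, 32, 3718.4, 0x56, 0777, 3.14, -5, 1.2e5 );
my %filtered = sketch_breakdown(
    {
        negative => [ undef, 0 ],
        positive => {
            small  => [ 0,   10 ],
            medium => [ 10,  100 ],
            large  => [ 100, undef ],
        },
    },
    @numbers,
);

# $filtered{negative}         holds (-5)
# $filtered{positive}{small}  holds (1, 3.14)
# $filtered{positive}{medium} holds (32, 86)
# $filtered{positive}{large}  holds (3718.4, 511, 120000)
```

Because every test runs over the full list independently, an item can land in several buckets or in none, which is the difference from a partition that the DESCRIPTION section draws.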
diff --git a/lib/List/Breakdown.pm b/lib/List/Breakdown.pm
index 291a329..653af62 100644
--- a/lib/List/Breakdown.pm
+++ b/lib/List/Breakdown.pm
@@ -96,7 +96,7 @@ __END__
=for stopwords
sublists Unhandled tradename licensable MERCHANTABILITY hashrefs CPAN AnnoCPAN
-syntaxes
+syntaxes perldoc
=head1 NAME
@@ -106,19 +106,6 @@ List::Breakdown - Build sublist structures matching conditions
Version 0.17
-=head1 DESCRIPTION
-
-This module "breaks down" a list--filtering elements from a list into a
-specified bucket layout. It may be useful in situations where you have a big
-list of things to generate reports on, or to otherwise filter into several
-sublists.
-
-It differs from the excellent C<List::Categorize> in the use of subroutine
-references for each category and in not requiring only one final category for
-any given item; an item can end up in the result set for more than one filter.
-
-You could maybe think of this as a multi-C<grep> that returns named results.
-
=head1 SYNOPSIS
use List::Breakdown 'breakdown';
@@ -151,33 +138,53 @@ This puts the following structure in C<%filtered>:
has_ba => ['bar', 'baz'],
)
+=head1 DESCRIPTION
+
+This module assists you in making a I<breakdown> of a list, copying and
+filtering its items into a structured bucket layout according to your
+specifications. Think of it as a syntax for L<C<grep>|perlfunc/"grep BLOCK
+LIST"> that returns named and structured results from one list.
+
+It differs from the excellent L<List::Categorize|List::Categorize> in the use
+of references to define each category, and in not requiring only one final
+category for any given item; an item can end up in the result set for more than
+one filter.
+
+If you want to divide or I<partition> your list so that each item can only
+appear in one category, you may want either
+L<List::MoreUtils|List::MoreUtils/"Partitioning"> or possibly
+L<Set::Partition|Set::Partition> instead.
+
=head1 SUBROUTINES/METHODS
-=head2 B<breakdown(\%spec, @items)>
+=head2 C<breakdown(\%spec, @items)>
-Exportable subroutine; given a hash reference structure and a list of items,
-apply each of the subroutines or regular expressions given as values of the
-hash reference, returning a new hash in the same structure with the tests
-replaced with the items for which the subroutine returns true, in the same way
-as C<grep>.
+This is the only exportable subroutine. Given a hash reference structure and a
+list of items, it applies each of the referenced values as tests, returning a
+new hash in the same structure with the references replaced with the matching
+items, in the same way as L<C<grep>|perlfunc/"grep BLOCK LIST">.
-There are two shortcut syntaxes:
+There are two shortcut syntaxes for a value in the C<\%spec> structure:
=over 4
-=item *
+=item * C<ARRAY>
-If a value in the C<spec> structure is an C<ARRAY> reference with two items, it
-will be interpreted as defining bounds C<[lower,upper)> for matched values.
-C<undef> can be used to denote negative or positive infinity.
+If the referenced array has exactly two items, it will be interpreted as
+defining numeric bounds C<[lower,upper)> for its values. C<undef> can be used
+to denote negative or positive infinity. Any other number of items is a fatal
+error.
-=item *
+=item * C<Regexp>
-If it's a C<Regexp> reference, it will be interpreted as a pattern to match
-against all of the items, and will return the items that match.
+This will be interpreted as a pattern for the list items to match.
=back
+Additionally, if the value is a C<HASH> reference, it can be used to make a
+sub-part of the structure, as demonstrated in the C<length> key of the example
+C<\%spec> given in L<SYNOPSIS|/SYNOPSIS>.
+
=head1 EXAMPLES
=head2 Collecting troublesome records
@@ -193,18 +200,15 @@ imperfect Perl parser:
"RETURNING CUSTOMER Aaron Carter 2016-05-01"
);
-You could have a bucket structure like this, which catches certain error types
-you've seen before for review:
+You could have a bucket structure like this, using the B<pattern syntax>, which
+catches certain error types you've seen before for review:
my %buckets = (
bad_whitespace => qr/ [\r\t] /msx,
apostrophes => qr/ ' /msx,
double_apostrophes => qr/ '' /msx,
not_ascii => qr/ [^[:ascii:]] /msx
- );
-
-Notice that you don't have to wrap a quoted regular expression to match in a
-C<sub> subroutine reference, as a convenience shortcut.
+ );
Applying the bucket structure like so:
@@ -229,12 +233,12 @@ The result set would look like this:
);
Notice that some of the lines appear in more than one list, and that the
-C<not_ascii> bucket is empty because none of the items matched it.
+C<not_ascii> bucket is empty, because none of the items matched it.
=head2 Monitoring system check results
-Suppose you ran a list of checks with your monitoring system, and you have a
-list of hashrefs describing each check and its outcome:
+Suppose you ran a list of checks with your monitoring system, and now you have
+a list of C<HASH> references with keys describing each check and its outcome:
my @checks = (
{
@@ -255,8 +259,8 @@ list of hashrefs describing each check and its outcome:
}
);
-You would like to break the list down by status. Using C<List::Breakdown>, you
-would lay out your buckets like so:
+You would like to break the list down by status. You would lay out your buckets
+like so, using the B<subroutine syntax>:
my %buckets = (
ok => sub { $_->{status} eq 'OK' },
@@ -271,9 +275,6 @@ And apply them like so:
my %results = breakdown \%buckets, @checks;
-You can then apply C<%buckets> to any other list you may need to check in the
-same way to get the same structure.
-
For our sample data above, this would yield the following structure in
C<%results>:
@@ -305,19 +306,19 @@ C<%results>:
}
)
-Note the extra level of hash referencing beneath the C<problem> key.
+Note the extra level of C<HASH> references beneath the C<problem> key.
=head2 Grouping numbers by size
-Suppose you have a list of stray numbers from your volcanic activity reporting
-system, some of which might be merely worrisome and some an emergency, and they
-need to be filtered to know where to send them:
+Suppose you have a list of numbers from your volcanic activity reporting
+system, some of which might be merely worrisome, and some others an emergency,
+and they need to be filtered to know where to send them:
my @numbers = ( 1, 32, 3718.4, 0x56, 0777, 3.14, -5, 1.2e5 );
-You could filter them into buckets like this, using the interval syntax; an
-array reference with exactly two elements; lower bound (inclusive) first, upper
-bound (exclusive) second:
+You could filter them into buckets like this, using the B<interval syntax>: an
+C<ARRAY> reference with exactly two elements: lower bound (inclusive) first,
+upper bound (exclusive) second:
my $filters = {
negative => [ undef, 0 ],
@@ -354,20 +355,21 @@ Tom Ryder C<< <tom@sanctum.geek.nz> >>
=over 4
-=item HASH reference expected for first argument
+=item C<HASH reference expected for first argument>
-The first argument that B<breakdown()> saw wasn't the hash reference it expects.
+The first argument that C<breakdown()> saw wasn't the hash reference it expects.
That's the only format a spec is allowed to have.
-=item Reference expected for '%s'
+=item C<Reference expected for '%s'>
The value for the named key in the spec was not a reference, and one was
expected.
-=item Unhandled ref type %s for '%s'
+=item C<Unhandled ref type %s for '%s'>
The value for the named key in the spec is of a type that makes no sense to
-this module. Legal reference types are C<HASH>, C<CODE>, and C<Regexp>.
+this module. Legal reference types are C<ARRAY>, C<CODE>, C<HASH>, and
+C<Regexp>.
=back
@@ -381,15 +383,15 @@ Perl 5.6.0 or newer
=item *
-C<base>
+L<base|base>
=item *
-C<Carp>
+L<Carp|Carp>
=item *
-C<Exporter>
+L<Exporter|Exporter>
=back
@@ -408,7 +410,7 @@ requests to C<tom@sanctum.geek.nz>.
=head1 SUPPORT
-You can find documentation for this module with the C<perldoc> command.
+You can find documentation for this module with the B<perldoc> command.
perldoc List::Breakdown
@@ -418,19 +420,19 @@ You can also look for information at:
=item * RT: CPAN's request tracker (report bugs here)
-<http://rt.cpan.org/NoAuth/Bugs.html?Dist=List-Breakdown>
+L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=List-Breakdown>
=item * AnnoCPAN: Annotated CPAN documentation
-<http://annocpan.org/dist/List-Breakdown>
+L<http://annocpan.org/dist/List-Breakdown>
=item * CPAN Ratings
-<http://cpanratings.perl.org/d/List-Breakdown>
+L<http://cpanratings.perl.org/d/List-Breakdown>
=item * Search CPAN
-<http://search.cpan.org/dist/List-Breakdown/>
+L<http://search.cpan.org/dist/List-Breakdown/>
=back
@@ -442,7 +444,7 @@ This program is free software; you can redistribute it and/or modify it under
the terms of the Artistic License (2.0). You may obtain a copy of the full
license at:
-<http://www.perlfoundation.org/artistic_license_2_0>
+L<http://www.perlfoundation.org/artistic_license_2_0>
Any use, modification, and distribution of the Standard or Modified Versions is
governed by this Artistic License. By using, modifying or distributing the