Data::Bucket
Data::Bucket is an indexed data store (bucket hashing).
Data::Bucket Ranking & Summary
- License: Perl Artistic License
- Price: FREE
- Publisher Name: Terrence M. Brannon
- Publisher web site: http://search.cpan.org/~tbone/Data-Bucket-0.07/lib/Data/Bucket.pm
Data::Bucket Description
Data::Bucket is an indexed data store (bucket hashing).

SYNOPSIS

    use base qw(Data::Bucket);

    # The default storage scheme stores things based on the first
    # character of the data. We override it in our subclass.
    # If we return an array-ref, then the data item is stored in each
    # bucket as opposed to a single one.
    sub compute_record_index {
        my ($self, $data) = @_;
        substr($data, 0, 2);    # index on the first two characters
    }

    open S, $file_to_be_searched or die $!;
    my $bucket = __PACKAGE__->index();

    open I, $file_with_queries or die $!;
    for my $line (<I>) {
        my @search_candidates = $bucket->based_on($line);
        my @score = sort map { fuzzy_match($line, $_) } @search_candidates;
    }

An example in which a single datum is dumped to multiple buckets

    use List::Util qw(min);

    sub compute_record_index {
        my ($self, $data) = @_;
        return undef unless $data;
        warn "$data";
        my @words = split /\s+/, $data;
        my $min   = min($#words, 1);
        my @index = map { substr($_, 0, 1) } @words;
        \@index;    # array-ref: the datum goes into every listed bucket
    }

    for my $search (qw(oh the so draw apple)) {
        my @b = $bucket->based_on($search);
        # do something with each value in @b and $search
    }

An example in which the lookup data differs in structure from input data

Yes, there is plenty of room for re-use between the two subs below, but to keep the example easy to follow, no refactoring is done.

    use List::Util qw(min);
    use Data::Dumper;

    # We compute record indices for bucketing by extracting a field from
    # the input hash reference
    sub compute_record_index {
        my ($self, $data) = @_;
        return undef unless $data;
        warn "$data";
        my @words = split /\s+/, $data->{clean_name};
        my $min   = min($#words, 2);
        my @index = map { substr($_, 0, 1) } @words;
        \@index;
    }

    # We find out the proper buckets in the input data by using a plain
    # string field
    sub retrieve_record_index {
        my ($self, $data) = @_;
        return undef unless $data;
        #warn "$data";
        my @words = split /\s+/, $data;
        my $min   = min($#words, 2);
        my @index = map { substr($_, 0, 1) } @words;
        if ($data =~ /01LA/) {
            warn "words", Dumper(\@words);
            warn "index", Dumper(\@index);
            for (@index) {
                warn "bucket($_)", Dumper($self->{bucket}{$_});
            }
        }
        \@index;
    }

Requirements:
- Perl
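The bucketing idea in the description above can be sketched in plain Perl, without Data::Bucket itself. This is only an illustration of the technique under stated assumptions: the helper names build_buckets and candidates, the $first_letters index function, and the sample strings are hypothetical and are not part of the module's API. As in the description, an index function may return either a single key or an array-ref of keys, in which case the datum is filed under every key.

```perl
use strict;
use warnings;

# Hypothetical helper (not Data::Bucket API): file each datum under the
# key or keys produced by $index_of, which may return a plain string
# or an array-ref of strings.
sub build_buckets {
    my ($index_of, @data) = @_;
    my %bucket;
    for my $datum (@data) {
        my $keys = $index_of->($datum);
        next unless defined $keys;
        # Treat a single key as a one-element list.
        for my $key (ref($keys) eq 'ARRAY' ? @$keys : $keys) {
            push @{ $bucket{$key} }, $datum;
        }
    }
    return \%bucket;
}

# Hypothetical helper: gather the stored data from every bucket that the
# query maps to, skipping data already collected from an earlier bucket.
sub candidates {
    my ($bucket, $index_of, $query) = @_;
    my $keys = $index_of->($query);
    return () unless defined $keys;
    my (%seen, @found);
    for my $key (ref($keys) eq 'ARRAY' ? @$keys : $keys) {
        push @found, grep { !$seen{$_}++ } @{ $bucket->{$key} || [] };
    }
    return @found;
}

# Index on the first letter of every whitespace-separated word.
my $first_letters = sub {
    my ($data) = @_;
    return undef unless defined $data && length $data;
    return [ map { substr($_, 0, 1) } split /\s+/, $data ];
};

my $bucket = build_buckets($first_letters, 'old town', 'the oak', 'sand dune');
my @hits   = candidates($bucket, $first_letters, 'oh');
print join(', ', @hits), "\n";
```

Only the data in the matching buckets are returned, so an expensive comparison such as a fuzzy match runs against a small candidate set rather than the whole input.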