Data::Bucket

Data::Bucket is an indexed data store (bucket hashing).
Data::Bucket Ranking & Summary

  • License: Perl Artistic License
  • Price: FREE
  • Publisher Name: Terrence M. Brannon
  • Publisher web site: http://search.cpan.org/~tbone/Data-Bucket-0.07/lib/Data/Bucket.pm

Data::Bucket Description

Data::Bucket is an indexed data store (bucket hashing).

SYNOPSIS

    use base qw(Data::Bucket);

    # The default storage scheme stores things based on the first character
    # of the data. We override it in our subclass.
    # If we return an array ref, then the data item is stored in each
    # bucket as opposed to a single one.
    sub compute_record_index {
        my ($self, $data) = @_;
        substr($data, 0, 2);
    }

    open S, $file_to_be_searched or die $!;
    my $bucket = __PACKAGE__->index();

    open I, $file_with_queries or die $!;

    for my $line (<I>) {
        my @search_candidates = $bucket->based_on($line);
        my @score = sort map { fuzzy_match($line, $_) } @search_candidates;
    }

An example in which a single datum is dumped to multiple buckets

    use List::Util qw(min);

    sub compute_record_index {
        my ($self, $data) = @_;
        return undef unless $data;
        warn "$data";
        my @words = split /\s+/, $data;
        my $min   = min($#words, 1);
        my @index = map { substr($_, 0, 1) } @words;
        \@index;
    }

    for my $search ( qw(oh the so draw apple) ) {
        my @b = $bucket->based_on($search);
        # do something with each value in @b and $search
    }

An example in which the lookup data differs in structure from input data

Yes, there is plenty of room for re-use between the two. But to keep the example easy to follow, no refactoring is done.

    use List::Util qw(min);
    use Data::Dumper;

    # We compute record indices for bucketing by extracting a field from
    # the input hash reference
    sub compute_record_index {
        my ($self, $data) = @_;
        return undef unless $data;
        warn "$data";
        my @words = split /\s+/, $data->{clean_name};
        my $min   = min($#words, 2);
        my @index = map { substr($_, 0, 1) } @words;
        \@index;
    }

    # We find out the proper buckets in the input data by using a plain
    # string field
    sub retrieve_record_index {
        my ($self, $data) = @_;
        return undef unless $data;
        #warn "$data";
        my @words = split /\s+/, $data;
        my $min   = min($#words, 2);
        my @index = map { substr($_, 0, 1) } @words;
        if ($data =~ /01LA/) {
            warn "words", Dumper @words;
            warn "index", Dumper @index;
            for (@index) {
                warn "bucket($_)", Dumper($self->{bucket}{$_});
            }
        }
        \@index;
    }

Requirements:

  • Perl
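
For readers new to bucket hashing, the idea behind the examples in the description can be sketched in plain Perl, without Data::Bucket itself. This is only an illustration under stated assumptions: the helper names bucket_key, build_index and candidates_for are hypothetical and are not part of the module's API, and the sample records are made up. Each record is filed under a key derived from its first character, so a later lookup only has to compare the query against records that share its key.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Bucket): the bucket key is the
# lowercased first character of the string.
sub bucket_key {
    my ($string) = @_;
    return lc substr($string, 0, 1);
}

# Build the index: a hash mapping each key to an array ref of records.
sub build_index {
    my (@records) = @_;
    my %bucket;
    for my $record (@records) {
        next unless defined $record && length $record;
        push @{ $bucket{ bucket_key($record) } }, $record;
    }
    return \%bucket;
}

# Return only the records that share a bucket with the query.
sub candidates_for {
    my ($index, $query) = @_;
    return @{ $index->{ bucket_key($query) } || [] };
}

my $index      = build_index(qw(apple avocado banana blueberry cherry));
my @candidates = candidates_for($index, 'apricot');
print join(', ', @candidates), "\n";    # prints "apple, avocado"
```

An expensive comparison, such as the fuzzy match in the synopsis, would then run only over the returned candidates rather than over the whole data set.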

