libarc

Libarc is a C library for accessing the contents of GZIP-compressed ARC files generated by the Heritrix web crawler.

libarc Ranking & Summary

  • License: GPL
  • Price: FREE
  • Publisher Name: Tom Emerson

libarc Description

Libarc is a C library for accessing the contents of GZIP-compressed ARC files generated by the Internet Archive's Heritrix web crawler.

Here are some key features of "libarc":

  • Opens and scans the contents of a GZIP-compressed ARC file. The library does not currently read CDX index files, though this feature is planned for a future release.
  • Provides an iterator for walking over the contents of the ARC file member by member. You can specify a media type to limit the types of members you see.
  • Gives access to the information in a member's URL record and the response headers from the HTTP server.
  • Returns a member's data in a single API call.
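
To make the "URL record" and media-type filtering ideas concrete, here is a minimal, self-contained sketch in C. It does not use libarc's API, which this page does not document. It assumes the ARC version 1 URL-record layout (URL, IP address, 14-digit archive date, content type, and archive length, separated by spaces), and the names arc_url_record, parse_url_record, and media_type_matches are hypothetical helpers invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for this sketch; NOT libarc's own structure. */
typedef struct {
    char url[1024];        /* URL of the archived resource */
    char ip[64];           /* IP address the crawler fetched it from */
    char date[32];         /* archive date, YYYYMMDDhhmmss */
    char media_type[128];  /* content type reported for the member */
    long length;           /* number of bytes of member data that follow */
} arc_url_record;

/* Parse one ARC v1 URL-record line of the assumed form
 *   URL IP-address archive-date content-type archive-length
 * Returns 1 on success and 0 if the line is malformed. */
int parse_url_record(const char *line, arc_url_record *rec)
{
    assert(line != NULL && rec != NULL);
    /* Field widths keep sscanf from overflowing the fixed-size buffers. */
    int n = sscanf(line, "%1023s %63s %31s %127s %ld",
                   rec->url, rec->ip, rec->date, rec->media_type,
                   &rec->length);
    return n == 5 && rec->length >= 0;
}

/* Return 1 if the record has the wanted media type.
 * A NULL filter accepts every member. The comparison is exact and
 * case-sensitive, which is a simplification for this sketch. */
int media_type_matches(const arc_url_record *rec, const char *wanted)
{
    assert(rec != NULL);
    return wanted == NULL || strcmp(rec->media_type, wanted) == 0;
}
```

A caller would typically read each member's first line, pass it to parse_url_record, and skip the member when media_type_matches returns 0. That roughly mirrors the filtered iteration described above, though the real library may organize this differently.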


libarc Related Software