openxmllib

Processing OpenXML documents with Python

openxmllib Ranking & Summary


  • License: GPL
  • Price: FREE
  • Publisher Name: Gilles Lenfant
  • Publisher web site: http://code.google.com/u/gilles.lenfant/

openxmllib Description

openxmllib provides resources to handle OpenXML documents from Python.

OpenXML is the new office document format supported natively by MS Office 2007, and as an import/export format by Apple iWork '08 and OpenOffice 2.2. OpenXML is defined in the ECMA-376 standard.

This library runs on any platform that supports Python 2.4 and the lxml Python library.

Installation (from tarball)

Get the tarball with your browser, wget, curl or whatever, then:

    $ tar xfz openxmllib-x.y.z.tar.gz
    $ cd openxmllib-x.y.z
    $ sudo python setup.py install

Installation from Cheeseshop

Even easier than the tarball:

    $ easy_install openxmllib

Usage:

    >>> import openxmllib
    >>> doc = openxmllib.openXmlDocument('office.docx')
    >>> # Raises a ValueError on unsupported office files.
    >>> doc.mimeType
    'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
    >>> doc.coreProperties # Keys may depend on application
    {'title': u'blah...', u'creator': u'John Doe', ...}
    >>> doc.extendedProperties # Keys may depend on application
    {'Words': u'312', 'Application': u'Your favorite word processor', ...}
    >>> doc.customProperties # May return an empty mapping
    {'My property': u'My value', ...}
    >>> doc.allProperties # Merges core+extended+custom properties (see above)
    {...}
    >>> doc.indexableText(include_properties=False)
    u'all the words of that document body'
    >>> doc.indexableText(include_properties=True)
    u'all the words of that document body and all properties values'

Here are some key features of "openxmllib":

  · Extract indexable text from an OpenXML document.
  · Extract metadata from an OpenXML document (author, title, ...)

Requirements:

  · Python
  · lxml

What's New in This Release:

  · The mid-word style change bug is still not fixed in presentations and spreadsheets. Anyway, we needed an API sanitization.
  · Factory API changed for safer and faster document object construction.
  · Added support for new MIME types that are not in the standard mimetypes module.

