libextractor-python

Python bindings for GNU libextractor

About libextractor
==================

 libextractor is a simple library for keyword extraction. It does not
 support every file format, but it offers a simple plugin mechanism
 that lets you quickly add extractors for additional formats, even
 without recompiling libextractor. libextractor typically ships with a
 dozen helper libraries that can be used to obtain keywords from
 common file types.
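
 The plugin idea described above can be pictured with a small,
 purely conceptual sketch. Every name in it (PluginRegistry,
 add_plugin, the two extractor functions) is hypothetical and chosen
 for illustration; none of it is the libextractor or
 libextractor-python API. It only shows how format-specific
 extractors might be added and removed at run time without touching
 the core.

```python
# Conceptual sketch only: all names here are hypothetical and are NOT
# part of libextractor or its Python bindings.


def extract_text_title(data):
    """Hypothetical extractor: use the first line of UTF-8 text as a title."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return []  # not text this plugin understands
    lines = text.splitlines()
    first = lines[0].strip() if lines else ""
    return [("title", first)] if first else []


def extract_png_mimetype(data):
    """Hypothetical extractor: recognize the 8-byte PNG signature."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return [("mimetype", "image/png")]
    return []


class PluginRegistry(object):
    """Keeps a set of named extractor functions and runs them over data."""

    def __init__(self):
        self._plugins = {}

    def add_plugin(self, name, func):
        # New formats are supported by registering another function.
        self._plugins[name] = func

    def remove_plugin(self, name):
        self._plugins.pop(name, None)

    def extract(self, data):
        # Run every registered plugin and collect (plugin, kind, value).
        keywords = []
        for name in sorted(self._plugins):
            for kind, value in self._plugins[name](data):
                keywords.append((name, kind, value))
        return keywords


registry = PluginRegistry()
registry.add_plugin("text", extract_text_title)
registry.add_plugin("png", extract_png_mimetype)
print(registry.extract(b"Hello world\nsecond line\n"))
```

 For the sample input only the "text" plugin reports a keyword; the
 "png" plugin sees no PNG signature and returns nothing.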

 libextractor is part of the GNU project (http://www.gnu.org/).

Dependencies
============

 * Python >= 2.7
    web site: http://www.python.org/

 * libextractor >= 1.6
    web site: http://www.gnu.org/software/libextractor/

 * ctypes >= 0.9 (part of the Python standard library since Python 2.5)
    web site: http://starship.python.net/crew/theller/ctypes/

 * setuptools (optional)
    web site: http://cheeseshop.python.org/pypi/setuptools
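
 A quick way to see which of the dependencies listed above are
 present is sketched below. It relies only on the Python standard
 library; check_dependencies is a hypothetical helper name, not
 something shipped with these bindings. It assumes the shared library
 is discoverable under the name "extractor", and it cannot tell
 whether the installed libextractor is at least version 1.6.

```python
import sys
import ctypes.util  # importing this already shows that ctypes is available


def check_dependencies():
    """Return a dict mapping each dependency name to True or False."""
    report = {}
    # The README asks for Python >= 2.7.
    report["python"] = sys.version_info[:2] >= (2, 7)
    # find_library returns a name or path string, or None if not found.
    # Assumption: the library is installed as libextractor (name "extractor").
    report["libextractor"] = ctypes.util.find_library("extractor") is not None
    # setuptools is optional.
    try:
        import setuptools  # noqa: F401
        report["setuptools"] = True
    except ImportError:
        report["setuptools"] = False
    return report


for name, ok in sorted(check_dependencies().items()):
    print("%-12s %s" % (name, "found" if ok else "missing"))
```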

Performance
===========

 Surprisingly, the native C library is not dramatically faster than
 these Python ctypes bindings. In the quick and dirty benchmark below,
 the bindings take about 1.6 times as long in wall-clock time (0.661 s
 versus 0.403 s).

 The C extract tool on the Extractor test files:

 $ time `find Extractor/test -type f -not -name "*.svn*"|xargs extract`

  real    0m0.403s
  user    0m0.303s
  sys     0m0.061s

 The same data with the Python ctypes bindings:

 $ time `find Extractor/test -type f -not -name "*.svn*"|xargs extract.py`

  real    0m0.661s
  user    0m0.529s
  sys     0m0.074s
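
 To make the comparison concrete, the ratios implied by the two runs
 above can be worked out directly. The figures are copied from the
 timings listed in this section; the slowdown helper is just an
 illustrative name.

```python
# Timings in seconds, copied from the two benchmark runs above.
c_extract = {"real": 0.403, "user": 0.303, "sys": 0.061}
py_extract = {"real": 0.661, "user": 0.529, "sys": 0.074}


def slowdown(kind):
    """How many times longer the Python bindings took than the C tool."""
    return py_extract[kind] / c_extract[kind]


for kind in ("real", "user", "sys"):
    print("%-4s %.2fx" % (kind, slowdown(kind)))
# real 1.64x
# user 1.75x
# sys  1.21x
```

 Both runs are single measurements on a small test set, so these
 ratios are only a rough indication.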

Install
=======

 Using the tarball (as root):
 # python setup.py install

Copyright
=========

 Copyright (C) 2006 Bader Ladjemi <bader@tele2.fr>
 Copyright (C) 2011 Christian Grothoff <christian@grothoff.org>
 Copyright (C) 2017, 2018 Nikita Gillmann <nikita@n0.is>

 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 3 of the License, or
 (at your option) any later version.

 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.

 You should have received a copy of the GNU General Public License
 along with this program; if not, write to the Free Software
 Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

 see COPYING for details