About
=====

GNU libextractor is a simple C library for keyword extraction. 
Common use-cases for GNU libextractor include detail-views in
file managers, detailed search results in file-sharing networks 
and general information gathering in forensics investigations
and penetration testing.

Bindings for GNU libextractor exist for many languages in addition to
the standard C/C++ API (we know about bindings for Java, Perl, PHP,
Mono, Python, Ruby).

libextractor uses a plugin mechanism to enable developers to quickly
add extractors for additional formats.  Plugins are executed
out-of-process, so bugs in them (or in the libraries that they use)
cannot crash the main application.  libextractor typically ships
with a few dozen plugins that can be used to obtain keywords from
common file types.
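
To illustrate why out-of-process execution protects the main
application, here is a minimal sketch of the general technique using
POSIX fork() and waitpid().  This is not libextractor's actual
implementation; the names plugin_fn, run_isolated, good_plugin and
crashing_plugin are hypothetical and exist only for this example.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical plugin signature (an assumption for this sketch):
   returns 0 on success, non-zero on failure. */
typedef int (*plugin_fn)(const char *filename);

/* Run `plugin` in a child process.
   Returns the child's exit status (0..255) if it exited normally,
   or -1 if it could not be started or was killed by a signal. */
int run_isolated(plugin_fn plugin, const char *filename)
{
  int status;
  pid_t pid = fork();

  if (pid < 0)
    return -1;                        /* fork failed */
  if (pid == 0)
    _exit(plugin(filename) & 0xff);   /* child: run plugin, then exit */
  if (waitpid(pid, &status, 0) < 0)
    return -1;                        /* parent: could not wait for child */
  if (WIFEXITED(status))
    return WEXITSTATUS(status);       /* plugin finished on its own */
  return -1;                          /* plugin crashed (e.g. SIGABRT) */
}

/* Hypothetical well-behaved plugin. */
int good_plugin(const char *filename)
{
  (void) filename;
  return 0;
}

/* Hypothetical buggy plugin: abort() kills only the child process. */
int crashing_plugin(const char *filename)
{
  (void) filename;
  abort();
}
```

Calling run_isolated() with crashing_plugin reports a failure to the
caller instead of terminating it, which is the property the plugin
mechanism relies on.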

More detailed documentation is available in the GNU libextractor
manual.  libextractor is an official GNU package and is available from
http://www.gnu.org/s/libextractor/.


extract
=======

extract is a simple command-line interface to GNU libextractor.


Dependencies
============

* GNU C/C++ compiler
* libltdl 2.2.x (from GNU libtool)
* GNU libtool 2.2 or higher
* GNU gettext

The following dependencies are all optional, but should be
available for maximum format coverage:

* libarchive
* libavutil / libavformat / libavcodec / libswscale (ffmpeg)
* libbz2 (bzip2)
* libexiv2
* libflac
* libgif (giflib)
* libglib (glib)
* libgtk+
* libgsf
* libgstreamer
* libjpeg
* libmagic (file)
* libmpeg2
* librpm
* libtidy
* libtiff
* libvorbis / libogg
* libz (zlib)

When building libextractor binaries, please make sure all of these
dependencies are available and that configure detects a sufficiently
recent installation of each.  Otherwise the build system may
automatically build only a subset of GNU libextractor, resulting in
incomplete metadata extraction.

Finally, 'zzuf' is a fuzzing tool that can optionally be detected by
the build system and be used for debugging / testing.  It is not required
at runtime or for normal builds.