Diffstat (limited to 'doc/extractor.texi')
-rw-r--r-- doc/extractor.texi 212
1 files changed, 164 insertions, 48 deletions
diff --git a/doc/extractor.texi b/doc/extractor.texi
index d382aed..4bf6743 100644
--- a/doc/extractor.texi
+++ b/doc/extractor.texi
@@ -10,8 +10,10 @@
10@c %**end of header 10@c %**end of header
11@copying 11@copying
12This manual is for GNU libextractor 12This manual is for GNU libextractor
13(version @value{VERSION}, @value{UPDATED}), 13(version @value{VERSION}, @value{UPDATED}).
14which is GNU's library for meta data extraction. 14
15GNU libextractor is a GNU package.
16
15 17
16Copyright @copyright{} 2007, 2010 Christian Grothoff 18Copyright @copyright{} 2007, 2010 Christian Grothoff
17 19
@@ -73,7 +75,7 @@ Free Documentation License".
73@code{NULL} 75@code{NULL}
74@end macro 76@end macro
75 77
76@macro le{} 78@macro gnule{}
77@acronym{GNU libextractor} 79@acronym{GNU libextractor}
78@end macro 80@end macro
79 81
@@ -84,24 +86,22 @@ Free Documentation License".
84@insertcopying 86@insertcopying
85@end ifnottex 87@end ifnottex
86 88
87GNU libextractor is a GNU package.
88
89@menu 89@menu
90* Introduction:: What is @le{}. 90* Introduction:: What is @gnule{}.
91* Preparation:: What you should do before using the library. 91* Preparation:: What you should do before using the library.
92* Generalities:: General library functions and data types. 92* Generalities:: General library functions and data types.
93* Extracting meta data:: How to use @le{} to obtain meta data. 93* Extracting meta data:: How to use @gnule{} to obtain meta data.
94* Language bindings:: How to use @le{} from languages other than C. 94* Language bindings:: How to use @gnule{} from languages other than C.
95* Utility functions:: Utility functions of @le{}. 95* Utility functions:: Utility functions of @gnule{}.
96* Existing Plugins:: What plugins are available. 96* Existing Plugins:: What plugins are available.
97* Writing new Plugins:: How to write new plugins for @le{}. 97* Writing new Plugins:: How to write new plugins for @gnule{}.
98* Internal utility functions:: Utility functions of @le{} for writing plugins. 98* Internal utility functions:: Utility functions of @gnule{} for writing plugins.
99* Reporting bugs:: How to report bugs or request new features. 99* Reporting bugs:: How to report bugs or request new features.
100 100
101Appendices 101Appendices
102 102
103* Copying:: The GNU General Public License says how you 103* Copying:: The GNU General Public License says how you
104 can copy and share some parts of @le{}. 104 can copy and share some parts of @gnule{}.
105 105
106Indices 106Indices
107 107
@@ -120,7 +120,7 @@ Indices
120@chapter Introduction 120@chapter Introduction
121 121
122@cindex error handling 122@cindex error handling
123@le{} is GNU's library for extracting meta data from 123@gnule{} is GNU's library for extracting meta data from
124files. Meta data includes format information (such as mime type, 124files. Meta data includes format information (such as mime type,
125image dimensions, color depth, recording frequency), content 125image dimensions, color depth, recording frequency), content
126descriptions (such as document title or document description) and 126descriptions (such as document title or document description) and
@@ -128,55 +128,55 @@ copyright information (such as license, author and contributors).
128Meta data extraction is an inherently uncertain business --- a parse 128Meta data extraction is an inherently uncertain business --- a parse
129error can be a corrupt file, an incompatibility in the file format 129error can be a corrupt file, an incompatibility in the file format
130version, an entirely different file format or a bug in the parser. As 130version, an entirely different file format or a bug in the parser. As
131a result of this uncertainty, @le{} deliberately 131a result of this uncertainty, @gnule{} deliberately
132never reports any errors. Unexpected file contents simply 132never reports any errors. Unexpected file contents simply
133result in less or possibly no meta data being extracted. 133result in less or possibly no meta data being extracted.
134 134
135@cindex plugin 135@cindex plugin
136@le{} uses plugins to handle various file formats. 136@gnule{} uses plugins to handle various file formats.
137Technically a plugin can support multiple file formats; however, most 137Technically a plugin can support multiple file formats; however, most
138plugins only support one particular format. By default, 138plugins only support one particular format. By default,
139@le{} will use all plugins that are available and found 139@gnule{} will use all plugins that are available and found
140in the plugin installation directory. Applications can 140in the plugin installation directory. Applications can
141request the use of only specific plugins or the exclusion of 141request the use of only specific plugins or the exclusion of
142certain plugins. 142certain plugins.
143 143
144@le{} is distributed with the @command{extract} 144@gnule{} is distributed with the @command{extract}
145command@footnote{Some distributions ship @command{extract} in a 145command@footnote{Some distributions ship @command{extract} in a
146separate package.} which is a command-line tool for extracting 146separate package.} which is a command-line tool for extracting
147meta data. @command{extract} is given a list of filenames and 147meta data. @command{extract} is given a list of filenames and
148prints the resulting meta data to the console. The @command{extract} 148prints the resulting meta data to the console. The @command{extract}
149source code also serves as an advanced example for how to use 149source code also serves as an advanced example for how to use
150@le{}. 150@gnule{}.
151 151
152This manual focuses on providing documentation for writing software 152This manual focuses on providing documentation for writing software
153with @le{}. The only relevant parts for end-users 153with @gnule{}. The only relevant parts for end-users
154are the chapter on compiling and installing @le{} 154are the chapter on compiling and installing @gnule{}
155(@xref{Preparation}.). Also, the chapter on existing plugins may be of 155(@xref{Preparation}.). Also, the chapter on existing plugins may be of
156interest (@xref{Existing Plugins}.). Additional documentation for 156interest (@xref{Existing Plugins}.). Additional documentation for
157end-users can be found in the man page on @command{extract} (using 157end-users can be found in the man page on @command{extract} (using
158@verb{|man extract|}). 158@verb{|man extract|}).
159 159
160@cindex license 160@cindex license
161@le{} is licensed under the GNU General Public License. The 161@gnule{} is licensed under the GNU General Public License. The
162developers have frequently received requests to license GNU 162developers have frequently received requests to license GNU
163libextractor under alternative terms. However, @le{} 163libextractor under alternative terms. However, @gnule{}
164borrows plenty of GPL-licensed code from various other projects. 164borrows plenty of GPL-licensed code from various other projects.
165Hence we cannot change the license (even if we wanted to).@footnote{It 165Hence we cannot change the license (even if we wanted to).@footnote{It
166may be possible to switch to GPLv3 in the future. For this, an audit 166may be possible to switch to GPLv3 in the future. For this, an audit
167of the license status of our dependencies would be required. The new 167of the license status of our dependencies would be required. The new
168code that was developed specifically for @le{} has 168code that was developed specifically for @gnule{} has
169always been licensed under GPLv2 @emph{or any later version}.} 169always been licensed under GPLv2 @emph{or any later version}.}
170 170
171@node Preparation 171@node Preparation
172@chapter Preparation 172@chapter Preparation
173 173
174Compiling @le{} follows the standard GNU autotools 174Compiling @gnule{} follows the standard GNU autotools
175build process using @command{configure} and @command{make}. For 175build process using @command{configure} and @command{make}. For
176details, read the @file{INSTALL} file and query 176details, read the @file{INSTALL} file and query
177@verb{|./configure --help|} for additional options. 177@verb{|./configure --help|} for additional options.
178 178
179@le{} has various dependencies, some of which are optional. 179@gnule{} has various dependencies, some of which are optional.
180Instead of specifying the names of the software packages, we 180Instead of specifying the names of the software packages, we
181will give the list in terms of the names of the respective 181will give the list in terms of the names of the respective
182Debian (unstable) packages that should be installed. 182Debian (unstable) packages that should be installed.
@@ -246,29 +246,29 @@ Please notify us if we missed some dependencies (note that the list is
246supposed to only list direct dependencies, not transitive 246supposed to only list direct dependencies, not transitive
247dependencies). 247dependencies).
248 248
249Once you have compiled and installed @le{}, you should have a file 249Once you have compiled and installed @gnule{}, you should have a file
250@file{extractor.h} installed in your @file{include/} directory. This 250@file{extractor.h} installed in your @file{include/} directory. This
251file should be the starting point for your C and C++ development with 251file should be the starting point for your C and C++ development with
252@le{}. The build process also installs the @file{extract} binary and 252@gnule{}. The build process also installs the @file{extract} binary and
253man pages for @file{extract} and @le{}. The @file{extract} man page 253man pages for @file{extract} and @gnule{}. The @file{extract} man page
254documents the @file{extract} tool. The @le{} man page gives a brief 254documents the @file{extract} tool. The @gnule{} man page gives a brief
255summary of the C API for @le{}. 255summary of the C API for @gnule{}.
256 256
257@cindex packaging 257@cindex packaging
258@cindex directory structure 258@cindex directory structure
259@cindex plugin 259@cindex plugin
260@cindex environment variables 260@cindex environment variables
261@vindex LIBEXTRACTOR_PREFIX 261@vindex LIBEXTRACTOR_PREFIX
262When you install @le{}, various plugins will be 262When you install @gnule{}, various plugins will be
263installed in the @file{lib/libextractor/} directory. The main library 263installed in the @file{lib/libextractor/} directory. The main library
264will be installed as @file{lib/libextractor.so}. Note that 264will be installed as @file{lib/libextractor.so}. Note that
265@le{} will attempt to find the plugins relative to the 265@gnule{} will attempt to find the plugins relative to the
266path of the main library. Consequently, a package manager can move 266path of the main library. Consequently, a package manager can move
267the library and its plugins to a different location later --- as long 267the library and its plugins to a different location later --- as long
268as the relative path between the main library and the plugins is 268as the relative path between the main library and the plugins is
269preserved. As a method of last resort, the user can specify an 269preserved. As a method of last resort, the user can specify an
270environment variable @verb{|LIBEXTRACTOR_PREFIX|}. If 270environment variable @verb{|LIBEXTRACTOR_PREFIX|}. If
271@le{} cannot locate a plugin, it will look in 271@gnule{} cannot locate a plugin, it will look in
272@verb{|LIBEXTRACTOR_PREFIX/lib/libextractor/|}. 272@verb{|LIBEXTRACTOR_PREFIX/lib/libextractor/|}.
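The last-resort lookup described above can be sketched in plain C. This is only an illustration of how the fallback directory is formed from the environment variable; the helper name `fallback_plugin_dir` is hypothetical and is not part of the library, and the library's real lookup code may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not a libextractor function): write the
   fallback plugin directory "$LIBEXTRACTOR_PREFIX/lib/libextractor/"
   into buf.  Returns 0 on success, -1 if the variable is unset or
   empty, or if buf is too small to hold the full path. */
int
fallback_plugin_dir (char *buf, size_t buf_size)
{
  const char *prefix = getenv ("LIBEXTRACTOR_PREFIX");
  int n;

  if ((NULL == prefix) || ('\0' == prefix[0]))
    return -1;
  n = snprintf (buf, buf_size, "%s/lib/libextractor/", prefix);
  if ((n < 0) || ((size_t) n >= buf_size))
    return -1; /* output error or truncated path */
  return 0;
}
```

A caller would try the path relative to the main library first and consult this directory only when that search fails.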
273 273
274@section Note to package maintainers 274@section Note to package maintainers
@@ -304,9 +304,9 @@ resources.
304@node Generalities 304@node Generalities
305@chapter Generalities 305@chapter Generalities
306 306
307Each public symbol exported by @le{} has the prefix 307Each public symbol exported by @gnule{} has the prefix
308@verb{|EXTRACTOR_|}. All-caps names are used for constants. For the 308@verb{|EXTRACTOR_|}. All-caps names are used for constants. For the
309impatient, the minimal C code for using @le{} (on the 309impatient, the minimal C code for using @gnule{} (on the
310executing binary itself) looks like this: 310executing binary itself) looks like this:
311 311
312@verbatim 312@verbatim
@@ -326,6 +326,13 @@ int main(int argc, char ** argv) {
326@node Extracting meta data 326@node Extracting meta data
327@chapter Extracting meta data 327@chapter Extracting meta data
328 328
329In order to extract meta data with @gnule{} you first need to
330load the respective plugins and then call the extraction API
331with the plugins and the data to process. This section
332documents how to load and unload plugins, the various types
333and formats in which meta data is returned to the application
334and finally the extraction API itself.
335
329@menu 336@menu
330* Plugin management:: How to load and unload plugins 337* Plugin management:: How to load and unload plugins
331* Meta types:: About meta types 338* Meta types:: About meta types
@@ -350,7 +357,7 @@ from multiple threads at the same time is not safe. Creating multiple
350plugin lists and using them concurrently is supported as long as 357plugin lists and using them concurrently is supported as long as
351the @code{EXTRACTOR_OPTION_IN_PROCESS} option is not used. 358the @code{EXTRACTOR_OPTION_IN_PROCESS} option is not used.
352 359
353Generally, @le{} is fully thread-safe and mostly reentrant. 360Generally, @gnule{} is fully thread-safe and mostly reentrant.
354All plugin code is required to be reentrant and state-less, 361All plugin code is required to be reentrant and state-less,
355but due to the extensive use of 3rd party libraries this cannot 362but due to the extensive use of 3rd party libraries this cannot
356be guaranteed. Hence plugins are executed (by default) out of 363be guaranteed. Hence plugins are executed (by default) out of
@@ -402,7 +409,7 @@ Loads and unloads plugins based on a configuration string, modifying the existin
402@deftypefun {struct EXTRACTOR_PluginList *} EXTRACTOR_plugin_add_defaults (enum EXTRACTOR_Options flags) 409@deftypefun {struct EXTRACTOR_PluginList *} EXTRACTOR_plugin_add_defaults (enum EXTRACTOR_Options flags)
403@findex EXTRACTOR_plugin_add_defaults 410@findex EXTRACTOR_plugin_add_defaults
404 411
405Loads all of the plugins in the plugin directory. This function is what most @le{} applications should use to set up the plugins. 412Loads all of the plugins in the plugin directory. This function is what most @gnule{} applications should use to set up the plugins.
406@end deftypefun 413@end deftypefun
407 414
408 415
@@ -414,14 +421,14 @@ Loads all of the plugins in the plugin directory. This function is what most @l
414@tindex enum EXTRACTOR_MetaType 421@tindex enum EXTRACTOR_MetaType
415@findex EXTRACTOR_metatype_get_max 422@findex EXTRACTOR_metatype_get_max
416 423
417@verb{|enum EXTRACTOR_MetaType|} is a C enum which defines a list of over 100 different types of meta data. The total number can differ between different @le{} releases; the maximum value for the current release can be obtained using the @verb{|EXTRACTOR_metatype_get_max|} function. All values in this enumeration are of the form @verb{|EXTRACTOR_METATYPE_XXX|}. 424@verb{|enum EXTRACTOR_MetaType|} is a C enum which defines a list of over 100 different types of meta data. The total number can differ between different @gnule{} releases; the maximum value for the current release can be obtained using the @verb{|EXTRACTOR_metatype_get_max|} function. All values in this enumeration are of the form @verb{|EXTRACTOR_METATYPE_XXX|}.
418 425
419@deftypefun {const char *} EXTRACTOR_metatype_to_string (enum EXTRACTOR_MetaType type) 426@deftypefun {const char *} EXTRACTOR_metatype_to_string (enum EXTRACTOR_MetaType type)
420@findex EXTRACTOR_metatype_to_string 427@findex EXTRACTOR_metatype_to_string
421@cindex gettext 428@cindex gettext
422@cindex internationalization 429@cindex internationalization
423 430
424The function @verb{|EXTRACTOR_metatype_to_string|} can be used to obtain a short English string @samp{s} describing the meta data type. The string can be translated into other languages using GNU gettext with the domain set to @le{} (@verb{|dgettext("libextractor", s)|}). 431The function @verb{|EXTRACTOR_metatype_to_string|} can be used to obtain a short English string @samp{s} describing the meta data type. The string can be translated into other languages using GNU gettext with the domain set to @gnule{} (@verb{|dgettext("libextractor", s)|}).
425@end deftypefun 432@end deftypefun
426 433
427@deftypefun {const char *} EXTRACTOR_metatype_to_description (enum EXTRACTOR_MetaType type) 434@deftypefun {const char *} EXTRACTOR_metatype_to_description (enum EXTRACTOR_MetaType type)
@@ -429,7 +436,7 @@ The function @verb{|EXTRACTOR_metatype_to_string|} can be used to obtain a short
429@cindex gettext 436@cindex gettext
430@cindex internationalization 437@cindex internationalization
431 438
432The function @verb{|EXTRACTOR_metatype_to_description|} can be used to obtain a longer English string @samp{s} describing the meta data type. The description may be empty if the short description returned by @code{EXTRACTOR_metatype_to_string} is already comprehensive. The string can be translated into other languages using GNU gettext with the domain set to @le{} (@verb{|dgettext("libextractor", s)|}). 439The function @verb{|EXTRACTOR_metatype_to_description|} can be used to obtain a longer English string @samp{s} describing the meta data type. The description may be empty if the short description returned by @code{EXTRACTOR_metatype_to_string} is already comprehensive. The string can be translated into other languages using GNU gettext with the domain set to @gnule{} (@verb{|dgettext("libextractor", s)|}).
433@end deftypefun 440@end deftypefun
434 441
435 442
@@ -490,11 +497,11 @@ Return 0 to continue extracting, 1 to abort.
490@cindex threads 497@cindex threads
491@cindex thread-safety 498@cindex thread-safety
492 499
493This is the main function for extracting keywords with @le{}. The first argument is a plugin list which specifies the set of plugins that should be used for extracting meta data. The @samp{filename} argument is optional and can be used to specify the name of a file to process. If @samp{filename} is NULL, then the @samp{data} argument must point to the in-memory data to extract meta data from. If @samp{filename} is non-NULL, @samp{data} can be NULL. If @samp{data} is non-null, then @samp{size} is the size of @samp{data} in bytes. Otherwise @samp{size} should be zero. For each meta data item found, GNU libextractor will call the @samp{proc} function, passing @samp{proc_cls} as the first argument to @samp{proc}. The other arguments to @samp{proc} depend on the specific meta data found. 500This is the main function for extracting keywords with @gnule{}. The first argument is a plugin list which specifies the set of plugins that should be used for extracting meta data. The @samp{filename} argument is optional and can be used to specify the name of a file to process. If @samp{filename} is NULL, then the @samp{data} argument must point to the in-memory data to extract meta data from. If @samp{filename} is non-NULL, @samp{data} can be NULL. If @samp{data} is non-null, then @samp{size} is the size of @samp{data} in bytes. Otherwise @samp{size} should be zero. For each meta data item found, GNU libextractor will call the @samp{proc} function, passing @samp{proc_cls} as the first argument to @samp{proc}. The other arguments to @samp{proc} depend on the specific meta data found.
494 501
495@cindex SIGBUS 502@cindex SIGBUS
496@cindex bus error 503@cindex bus error
497Meta data extraction should never really fail --- at worst, @le{} should not call @samp{proc} with any meta data. By design, @le{} should never crash or leak memory, even given corrupt files as input. Note however, that running @le{} on a corrupt file system (or incorrectly @verb{|mmap|}ed files) can result in the operating system sending a SIGBUS (bus error) to the process. While @le{} runs plugins out-of-process, it first maps the file into memory and then attempts to decompress it. During decompression it is possible to encounter a SIGBUS. @le{} will @emph{not} attempt to catch this signal and your application is likely to crash. Note again that this should only happen if the file @emph{system} is corrupt (not if individual files are corrupt). If this is not acceptable, you might want to consider running @le{} itself also out-of-process (as done, for example, by @url{http://grothoff.org/christian/doodle/,doodle}). 504Meta data extraction should never really fail --- at worst, @gnule{} should not call @samp{proc} with any meta data. By design, @gnule{} should never crash or leak memory, even given corrupt files as input. Note however, that running @gnule{} on a corrupt file system (or incorrectly @verb{|mmap|}ed files) can result in the operating system sending a SIGBUS (bus error) to the process. While @gnule{} runs plugins out-of-process, it first maps the file into memory and then attempts to decompress it. During decompression it is possible to encounter a SIGBUS. @gnule{} will @emph{not} attempt to catch this signal and your application is likely to crash. Note again that this should only happen if the file @emph{system} is corrupt (not if individual files are corrupt). If this is not acceptable, you might want to consider running @gnule{} itself also out-of-process (as done, for example, by @url{http://grothoff.org/christian/doodle/,doodle}).
498 505
499@end deftypefun 506@end deftypefun
500 507
@@ -509,7 +516,7 @@ Meta data extraction should never really fail --- at worst, @le{} should not cal
509@cindex PHP 516@cindex PHP
510@cindex Ruby 517@cindex Ruby
511 518
512@le{} works immediately with C and C++ code. Bindings for Java, Mono, Ruby, Perl, PHP and Python are available for download from the main @le{} website. Documentation for these bindings (if available) is part of the downloads for the respective binding. In all cases, a full installation of the C library is required before the binding can be installed. 519@gnule{} works immediately with C and C++ code. Bindings for Java, Mono, Ruby, Perl, PHP and Python are available for download from the main @gnule{} website. Documentation for these bindings (if available) is part of the downloads for the respective binding. In all cases, a full installation of the C library is required before the binding can be installed.
513 520
514@section Java 521@section Java
515 522
@@ -571,7 +578,7 @@ This binding is undocumented at this point.
571@cindex concurrency 578@cindex concurrency
572@cindex threads 579@cindex threads
573@cindex thread-safety 580@cindex thread-safety
574This chapter describes various utility functions for @le{} usage. All of the functions are reentrant. 581This chapter describes various utility functions for @gnule{} usage. All of the functions are reentrant.
575 582
576@menu 583@menu
577* Utility Constants:: 584* Utility Constants::
@@ -724,6 +731,115 @@ in-process (making it easier to debug) and without any of the other
724plugins. 731plugins.
725 732
726 733
734@section Example for a minimal extract method
735
736The following example shows how a plugin can return the mime type of
737a file.
738@example
739
740int
741EXTRACTOR_mymime_extract
742 (const char *data,
743 size_t data_size,
744 EXTRACTOR_MetaDataProcessor proc,
745 void *proc_cls,
746 const char * options)
747@{
748 if (data_size < 4)
749 return 0;
750 if (0 != memcmp (data, "\177ELF", 4))
751 return 0;
752 if (0 != proc (proc_cls,
753 "mymime",
754 EXTRACTOR_METATYPE_MIMETYPE,
755 EXTRACTOR_METAFORMAT_UTF8,
756 "text/plain",
757 "application/x-executable",
758 1 + strlen("application/x-executable")))
759 return 1;
760 /* more calls to 'proc' here as needed */
761 return 0;
762@}
763
764@end example
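To see how the return value of `proc` steers the plugin, the extract method from the example can be exercised without the library. The sketch below is an assumption-laden stand-in: the `demo_` types and their numeric values are placeholders for the declarations that normally come from `extractor.h`, and `keep_first` is a hypothetical processor that stores the first item and then asks the plugin to stop.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for declarations from extractor.h (placeholder values). */
enum demo_MetaType { DEMO_METATYPE_MIMETYPE = 1 };
enum demo_MetaFormat { DEMO_METAFORMAT_UTF8 = 1 };
typedef int (*demo_MetaDataProcessor) (void *cls, const char *plugin_name,
                                       enum demo_MetaType type,
                                       enum demo_MetaFormat format,
                                       const char *data_mime_type,
                                       const char *data, size_t data_len);

/* Same logic as the plugin example: report a mime type for ELF data. */
int
EXTRACTOR_mymime_extract (const char *data, size_t data_size,
                          demo_MetaDataProcessor proc, void *proc_cls,
                          const char *options)
{
  static const char mime[] = "application/x-executable";

  (void) options; /* unused in this sketch */
  if (data_size < 4)
    return 0;
  if (0 != memcmp (data, "\177ELF", 4))
    return 0;
  /* sizeof (mime) counts the terminating NUL, like 1 + strlen (...) */
  if (0 != proc (proc_cls, "mymime", DEMO_METATYPE_MIMETYPE,
                 DEMO_METAFORMAT_UTF8, "text/plain", mime, sizeof (mime)))
    return 1; /* the processor asked us to abort */
  return 0;
}

/* Hypothetical processor: remember the first value, then abort. */
struct first_item { char value[64]; int seen; };

int
keep_first (void *cls, const char *plugin_name, enum demo_MetaType type,
            enum demo_MetaFormat format, const char *data_mime_type,
            const char *data, size_t data_len)
{
  struct first_item *fi = cls;
  size_t n = (data_len < sizeof (fi->value)) ? data_len
                                             : sizeof (fi->value) - 1;

  (void) plugin_name; (void) type; (void) format; (void) data_mime_type;
  memcpy (fi->value, data, n);
  fi->value[n] = '\0';
  fi->seen = 1;
  return 1; /* 1 = abort, 0 = continue extracting */
}
```

Because `keep_first` returns 1, the extract method returns 1 as well and makes no further calls to `proc`; a processor that returns 0 lets the plugin continue with later items.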
765
766@section Plugin execution options
767
768Plugins can request that their execution be done in a particular way.
769For this, the plugin defines a function with the following signature:
770
771@verbatim
772const char *
773EXTRACTOR_XXX_options (void);
774@end verbatim
775
776The function should return a string with the execution options.
777Individual options in this string should be separated by semicolons.
778Options that are included in the string but not known to the library
779are ignored. The following options are supported:
780
781@itemize @bullet
782@item
783@code{oop-only} ensures that the plugin is only run out-of-process; if
784this is not possible, the plugin will not be executed at all if this
785option is set.
786
787@item
788@code{close-stderr} ensures that @code{stderr} is closed during the
789execution of the plugin. This is useful if the plugin uses libraries
790that write (error) messages to @code{stderr} and where this behavior cannot be
791turned off. This option only works if the plugin is executed out-of-process.
792
793@item
794@code{close-stdout} ensures that @code{stdout} is closed during the
795execution of the plugin. This is useful if the plugin uses libraries
796that write messages to @code{stdout} and where this behavior cannot be
797turned off. This option only works if the plugin is executed out-of-process.
798
799@item
800@code{force-kill} kills and restarts the plugin process for each
801file that is being analyzed. This is useful if the plugin uses
802libraries that keep global state between runs that is problematic or
803if the plugin uses libraries that are known to have serious resource
804leaks (such as memory leaks).
805
806@item
807@code{want-tail}
808In order to limit memory consumption, limit the amount of reading from
809disk and keep the API simple, the @samp{data} argument passed to
810the @code{EXTRACTOR_XXX_extract} method is bounded (to 32 MB of normal
811data; for compressed data, a limit of 16 MB is imposed).@footnote{If
812@gnule{} was given a pointer to an existing, uncompressed block of
813data in memory, no bound is imposed for plugins executing in-process;
814for out-of-process plugins, a 32 MB limit is still imposed.} Since
815some file formats contain meta data at the end of the file, this option
816provides a way for plugins to access not the first 16--32 MB of a file
817but instead the last (roughly) 32 MB.
818
819Note that even for files larger than 32 MB, @samp{size} is not
820guaranteed to be 32 MB since @samp{data} will be aligned to the page
821size of the operating system. However, the last byte of @samp{data}
822is guaranteed to be the last byte of the file. Furthermore, if the
823file was large and compressed, unlike in the case of meta data
824extraction from the header, the end of the file will not be
825automatically decompressed by @gnule{}.
826
827@end itemize
828
829Note that using options other than @code{want-tail} is pretty much
830always a kludge and should thus be avoided.
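The format of the options string (semicolon-separated tokens, unknown tokens ignored) can be illustrated with a small lookup routine. This is a minimal sketch and not the library's parser; the function name `has_option` is hypothetical.

```c
#include <string.h>

/* Hypothetical helper (not a libextractor function): return 1 if
   `option` appears as a complete semicolon-separated token in
   `options`, 0 otherwise.  Tokens that match nothing we ask for are
   simply skipped, mirroring "unknown options are ignored". */
int
has_option (const char *options, const char *option)
{
  size_t len = strlen (option);
  const char *pos = options;

  if ((NULL == options) || (0 == len))
    return 0;
  while (NULL != pos)
    {
      const char *end = strchr (pos, ';');
      size_t tok_len = (NULL == end) ? strlen (pos) : (size_t) (end - pos);

      if ((tok_len == len) && (0 == strncmp (pos, option, len)))
        return 1;
      pos = (NULL == end) ? NULL : end + 1; /* step past the ';' */
    }
  return 0;
}
```

Comparing whole tokens rather than using a substring search avoids treating a prefix such as `want` as a match for `want-tail`.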
831
832@section Example for an options method
833
834The following example shows how a plugin can set some of the options listed above:
835@example
836const char *
837EXTRACTOR_id3_options (void)
838@{
839 return "close-stderr;want-tail";
840@}
841@end example
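A plugin that requests @code{want-tail} typically looks for a trailer at a fixed distance from the end of @samp{data}. As a hedged illustration, the sketch below checks for an ID3v1 tag, which occupies the final 128 bytes of a file and starts with the marker `TAG`. The helper name `find_id3v1_tag` is hypothetical, and nothing here is taken from the actual id3 plugin.

```c
#include <stddef.h>
#include <string.h>

#define ID3V1_TAG_SIZE 128 /* an ID3v1 tag is the last 128 bytes of a file */

/* Hypothetical helper: `data` is assumed to end at the last byte of
   the file, which is what want-tail guarantees.  Returns a pointer to
   the start of the tag, or NULL if the buffer is too short or the
   "TAG" marker is absent. */
const char *
find_id3v1_tag (const char *data, size_t size)
{
  const char *tag;

  if ((NULL == data) || (size < ID3V1_TAG_SIZE))
    return NULL;
  tag = &data[size - ID3V1_TAG_SIZE];
  if (0 != memcmp (tag, "TAG", 3))
    return NULL;
  return tag;
}
```

Since only the end of the buffer is guaranteed to line up with the end of the file, the offset is computed from `size` rather than from the start of `data`.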
842
727@node Internal utility functions 843@node Internal utility functions
728@chapter Internal utility functions 844@chapter Internal utility functions
729 845
@@ -752,12 +868,12 @@ below.
752@cindex UTF-8 868@cindex UTF-8
753@cindex character set 869@cindex character set
754@findex EXTRACTOR_common_convert_to_utf8 870@findex EXTRACTOR_common_convert_to_utf8
755Various @le{} plugins make use of the internal 871Various @gnule{} plugins make use of the internal
756@file{convert.h} header which defines a function 872@file{convert.h} header which defines a function
757 873
758@verb{|EXTRACTOR_common_convert_to_utf8|} which can be used to easily convert text from 874@verb{|EXTRACTOR_common_convert_to_utf8|} which can be used to easily convert text from
759any character set to UTF-8. This conversion is important since the 875any character set to UTF-8. This conversion is important since the
760linked list of keywords that is returned by @le{} is 876linked list of keywords that is returned by @gnule{} is
761expected to contain only UTF-8 strings. Naturally, proper conversion 877expected to contain only UTF-8 strings. Naturally, proper conversion
762may not always be possible since some file formats fail to specify the 878may not always be possible since some file formats fail to specify the
763character set. In that case, it is often better to not convert at 879character set. In that case, it is often better to not convert at
@@ -781,9 +897,9 @@ caller, so storing the string in the keyword list is acceptable.
781@chapter Reporting bugs 897@chapter Reporting bugs
782 898
783@cindex bug 899@cindex bug
784@le{} uses the @url{http://gnunet.org/bugs/,Mantis bugtracking 900@gnule{} uses the @url{http://gnunet.org/bugs/,Mantis bugtracking
785system}. If possible, please report bugs there. You can also e-mail 901system}. If possible, please report bugs there. You can also e-mail
786the @le{} mailing list at @email{libextractor@@gnu.org}. 902the @gnule{} mailing list at @email{libextractor@@gnu.org}.
787 903
788 904
789 905