httpindex
Section: User Commands (1)
Updated: August 2, 2005
NAME
httpindex - HTTP front-end for SWISH++ indexer
SYNOPSIS
wget [options] URL... 2>&1 | httpindex [options]
DESCRIPTION
httpindex is a front-end for index++(1) to index files copied from remote servers using wget(1).
The files (in a copy of the remote directory structure)
can be kept, deleted, or replaced with their descriptions after indexing.
OPTIONS
wget Options
The wget(1) options that are required are: -A, -nv, -r, and -x;
the ones that are highly recommended are: -l, -nh, -t, and -w.
(See the EXAMPLE.)
httpindex Options
httpindex accepts the same short options as index++(1)
except for -H, -I, -l, -r, -S, and -V.
The following options are unique to httpindex:
-d
Replace the text of local copies of retrieved files with their descriptions
after they have been indexed.
This is useful for displaying file descriptions in search results
without keeping complete copies of the remote files,
thus saving filesystem space.
(See the extract_description() function in WWW(3)
for details about how descriptions are extracted.)
-D
Delete the local copies of retrieved files after they have been indexed.
This prevents your local filesystem from filling up
with copies of remote files.
EXAMPLE
To index all HTML and text files on a remote web server
keeping descriptions locally:
wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 |
httpindex -d -e'html:*.html,text:*.txt'
Note that you need to redirect wget(1)'s output
from standard error to standard output in order to pipe it to httpindex.
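The effect of the redirect can be seen without wget at all. The following is a minimal sketch that uses a hypothetical stand-in function, fake_log, which writes one made-up log line to standard error; neither the function nor the line's format comes from wget or httpindex.

```shell
#!/bin/sh
# fake_log is a hypothetical stand-in that writes one made-up log line
# to standard error, where wget writes its log output.
fake_log() {
    echo 'retrieved www.foo.com/index.html' >&2
}

# A pipe carries only standard output, so the reader sees no lines.
# (Standard error is discarded here only to keep the demo quiet;
# normally it would go to the terminal.)
without_redirect=$(fake_log 2>/dev/null | wc -l)

# With 2>&1, standard error is merged into standard output before the pipe.
with_redirect=$(fake_log 2>&1 | wc -l)

# $(( )) strips any padding that wc adds on some systems.
echo "lines read without redirect: $((without_redirect))"
echo "lines read with redirect: $((with_redirect))"
```

Only the second form delivers the log line to the command on the right of the pipe.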
EXIT STATUS
Exits with a value of zero only if indexing completed successfully;
non-zero otherwise.
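Because httpindex is the last command in the pipeline shown in the SYNOPSIS, a plain POSIX shell reports its exit status in $? once the pipeline finishes, whatever the status of the command before it. The following sketch illustrates this with two hypothetical stand-in functions, fake_fetch and fake_index, which are not part of wget or httpindex.

```shell
#!/bin/sh
# Hypothetical stand-ins so the sketch runs without wget or httpindex.
fake_fetch() { echo 'log line'; return 1; }    # pretend the fetch step failed
fake_index() { cat >/dev/null; return 0; }     # pretend indexing succeeded

# In a plain POSIX pipeline, $? is the exit status of the last command only.
fake_fetch 2>&1 | fake_index
status=$?

if [ "$status" -eq 0 ]; then
    echo 'indexing reported success'
else
    echo "indexing failed with status $status"
fi
```

A script that also needs to know whether the fetch step failed has to check that separately.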
CAVEATS
In addition to those for index++(1),
httpindex does not correctly handle the use of multiple
-e, -E, -m, or -M options
(because the Perl script processes command-line options with the standard
Getopt::Std package, which does not support repeated options).
The last of any of those options "wins."
The work-around is to give a single one of those options
multiple values separated by commas.
For example, if you want to do:
httpindex -e'html:*.html' -e'text:*.txt'
do this instead:
httpindex -e'html:*.html,text:*.txt'
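When the patterns are collected one at a time in a script, they can be joined before httpindex is called. The following is a minimal sketch; join_patterns is a hypothetical helper, not part of httpindex, and the command line is only printed rather than run.

```shell
#!/bin/sh
# join_patterns is a hypothetical helper: it joins its arguments with commas.
# The function body runs in a subshell, so the IFS change does not leak out.
join_patterns() (
    IFS=,
    # "$*" joins the positional parameters with the first character of IFS.
    printf '%s\n' "$*"
)

patterns=$(join_patterns 'html:*.html' 'text:*.txt')

# Print the resulting command line instead of running it.
echo "httpindex -e'$patterns'"
```

The joined value is then passed to a single -e option, so no occurrence is overridden by a later one.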
SEE ALSO
index++(1), wget(1), WWW(3)
AUTHOR
Paul J. Lucas
<pauljlucas@mac.com>