httpindex(1)                General Commands Manual               httpindex(1)

NAME
       httpindex - HTTP front-end for SWISH++ indexer

SYNOPSIS
       wget [ options ] URL...  2>&1 | httpindex [ options ]

DESCRIPTION
       httpindex  is a front-end for index++(1) to index files copied from re-
       mote servers using wget(1).  The files (in a copy of the remote  direc-
       tory  structure)  can be kept, deleted, or replaced with their descrip-
       tions after indexing.

OPTIONS
   wget Options
       The wget(1) options that are required are: -A, -nv,  -r,  and  -x;  the
       ones  that  are  highly recommended are: -l, -nh, -t, and -w.  (See the
       EXAMPLE.)

   httpindex Options
       httpindex accepts the same short options as index++(1) except  for  -H,
       -I, -l, -r, -S, and -V.

       The following options are unique to httpindex:

       -d     Replace  the  text of local copies of retrieved files with their
              descriptions after they have been indexed.  This  is  useful  to
              display  file  descriptions  in search results without having to
              have complete copies of the remote files thus saving  filesystem
              space.   (See  the  extract_description() function in WWW(3) for
              details about how descriptions are extracted.)

       -D     Delete the local copies of retrieved files after they have  been
              indexed.   This  prevents  your local filesystem from filling up
              with copies of remote files.

EXAMPLE
       To index all HTML and text files on a remote  web  server  keeping  de-
       scriptions locally:

            wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 |
            httpindex -d -e'html:*.html,text:*.txt'

       Note  that you need to redirect wget(1)'s output from standard error to
       standard output in order to pipe it to httpindex.
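
       The effect of the redirection can be sketched without a network.
       In the sketch below, fake_wget is a hypothetical stand-in that, like
       wget(1), writes its log only to standard error, and wc(1) stands in
       for httpindex as the reader of the pipe.  None of these names are
       part of SWISH++.

```shell
#!/bin/sh
# fake_wget is a hypothetical stand-in for wget: it writes one log line to
# standard error and nothing to standard output.
fake_wget() {
    echo "log line for one retrieved file" >&2
}

# Without 2>&1 the pipe carries only standard output, so the reader gets no
# lines.  Standard error is discarded here only to keep the terminal quiet.
without=$(fake_wget 2>/dev/null | wc -l | tr -d ' ')

# With 2>&1, standard error is merged into standard output and reaches the
# reader through the pipe.
with=$(fake_wget 2>&1 | wc -l | tr -d ' ')

echo "lines read without 2>&1: $without"
echo "lines read with 2>&1: $with"
```

       In this sketch the reader receives a line only in the second case,
       which is why the redirection has to appear before the pipe symbol.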

EXIT STATUS
       Exits with a value of zero only if indexing completed successfully;
       nonzero otherwise.
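
       A calling script can branch on this status.  By default the shell
       reports the status of the last command in a pipeline, which here
       would be httpindex.  The following is a minimal sketch under stated
       assumptions: fake_fetch, fake_index, and check_index are hypothetical
       stand-ins so that it runs without wget(1) or SWISH++ installed.

```shell
#!/bin/sh
# Hypothetical stand-in for wget: writes one log line to standard error.
fake_fetch() {
    echo "log line for one retrieved file" >&2
}

# Hypothetical stand-in for httpindex: consumes the log on standard input
# and exits with the status passed as its first argument.
fake_index() {
    cat >/dev/null
    return "$1"
}

# Run the pipeline and branch on the status of its last command.
check_index() {
    if fake_fetch 2>&1 | fake_index "$1"; then
        echo "indexing completed"
    else
        echo "indexing failed"
        return 1
    fi
}

ok_message=$(check_index 0)
fail_message=$(check_index 3) || true

echo "$ok_message"
echo "$fail_message"
```

       In real use the pipeline inside check_index would be the wget and
       httpindex command line from the EXAMPLE section.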

CAVEATS
       In addition to the caveats for index++(1), httpindex does not correctly
       handle multiple -e, -E, -m, or -M options, because the Perl script
       processes command-line options with the standard Getopt::Std package,
       which does not handle repeated options.  The last of any of those
       options ``wins.''

       The work-around is to give a single instance of the option multiple
       values separated by commas.  For example, if you want to do:

            httpindex -e'html:*.html' -e'text:*.txt'

       do this instead:

            httpindex -e'html:*.html,text:*.txt'
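
       When the patterns are assembled by a script, the comma-separated
       value can be built before httpindex is invoked.  A minimal sketch,
       assuming a POSIX shell; join_with_commas is a hypothetical helper,
       not part of SWISH++, and the command is only printed, not run:

```shell
#!/bin/sh
# join_with_commas is a hypothetical helper: it prints its arguments joined
# by commas.  The body runs in a subshell so the IFS change does not leak
# into the caller.
join_with_commas() (
    IFS=,
    printf '%s\n' "$*"
)

# The patterns are quoted so the shell does not expand them as globs.
patterns=$(join_with_commas 'html:*.html' 'text:*.txt')

# Print the resulting single-option command line.
echo "httpindex -e'$patterns'"
```

       The printed command has a single -e option, so no earlier value is
       overridden by a later one.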

SEE ALSO
       index++(1), wget(1), WWW(3)

AUTHOR
       Paul J. Lucas <pauljlucas@mac.com>

SWISH++                         August 2, 2005                    httpindex(1)
