PCD2HTML readme

pcd2html is a collection of scripts and Makefiles to convert
Kodak Photo CD data into documented HTML pages.

The user supplies several files called "rules" which control how
each image is converted into a JPG file.

It depends on convert, which is part of the ImageMagick package
available from

   http://www.imagemagick.org/

By supplying certain text files, each image can be documented in
English or German (or, with slight changes, in any other language).


Example
-------

For those clever people who never read any documentation:
You can find a complete set of example files to convert a set of
Photo-CDs at

   http://fam-tille.de/pcd-data/
   
The unchanged result of this set of files can be found at

   http://fam-tille.de/sparetime.html

If you stop reading here you are on your own.  Maybe the sparse
man page can help you a little, but this introduction is intended
to give you all the help you need.


Motivation
----------

There are many reasons for offering Photo-CD images in WWW browser
readable form.  Converting the images into JPG images is the first
step of this work.  There are three ways to do that:

 1. convert blindly with a dumb batch file
   --> This will result in poor quality and is absolutely not recommended.
 2. load the PCD image into an image manipulation program, do some
    enhancements (such as cropping, shrinking, normalizing the histogram)
    and save it as JPG
   --> The quality of the output will be OK, but it is a time-consuming
    process and (that's the point) do you really know what you have
    done to your image?  Imagine you want to insert a step in between
    the steps you have done, and try to remember what you did
    some days or months ago.
 3. use a batch method which is configurable for each image
   --> You will need some iterations to get the desired result, but
    you can always reproduce what you have done.
    Most basic image manipulation can be done by convert.
    It may take some time to run the batch for an image over
    and over until you are satisfied.  But why not use this time to
    write the descriptive text for the current image?
    

Converting images
-----------------
    
The third way is the way pcd2html goes.  In a rules file you define
all options for `convert` you want to use for each image to convert.
You start with a really simple rule:

The rules file contains the line:
key:001  first_image

This runs:

  convert <path_to_cd_imgs>/img0001.pcd key_001first_image.jpg
      
The rules file contains the user-defined key of the Photo CD and the
number of the PCD image as well as the name you want to give the
converted image.  Now look at the image.  Maybe you want to crop
the black border and shrink the image to fit on a normal browser page.
The new rules file looks like:

key:001  first_image [-crop 750x500+12+7 -geometry 600x450]

==>

  convert -crop 750x500+12+7 -geometry 600x450 \
          <path_to_cd_imgs>/img0001.pcd key_001first_image.jpg
  
All characters between [] will be used as options for convert.
So you can place any valid convert option here.

OK, what if the image file size is too big?  You can compress the
image.  By default pcd2html uses `-quality 75`.  The `-quality` option
is the only option you should avoid placing into [], because pcd2html
adds its own and I'm not sure which of the two convert would use.
Try the following:

key:001  first_image [-crop 750x500+12+7 -geometry 600x450] Q60

==>

  convert -crop 750x500+12+7 -geometry 600x450 -quality 60 \
          <path_to_cd_imgs>/img0001.pcd key_001first_image.jpg
  
Of course you can also specify lower compression if the image
shows ugly compression artifacts, e.g. "Q85".

Maybe you have to fiddle around a bit with all these options.
But this is no problem.  The only thing you have to do is to change
the rules file and type `make`.  Everything else is done automatically.

Consider choosing an image resolution from the Photo CD other than
the third (which convert uses by default).  OK, what about

key:001  first_image [-crop 1500x1000+24+14 -geometry 600x450] Q60 !4

==>

  convert -crop 1500x1000+24+14 -geometry 600x450 -quality 60 \
          <path_to_cd_imgs>/img0001.pcd[4] key_001first_image.jpg
  
You see there is full control over the conversion process.
It is possible to comment out a line by putting a `#` in the first
column.  This is useful if you want to keep the previous version of
a rule while you are unsure about the new one.
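
The mapping from a rules line to a convert call shown above can be
sketched in a few lines of Python.  This is only an illustration and
not the code pcd2html actually uses: the function name rule_to_command
and the regular expression are assumptions derived from the examples
in this README, the sketch always passes `-quality` explicitly, and it
does not handle the other things a rules line may contain.

```python
import re
import shlex

# Pattern for one image rule, modelled on the examples in this README:
#   key:001  first_image [convert options] Q60 !4
RULE_RE = re.compile(
    r"^(?P<key>[^\s:#]+):(?P<num>\d+)\s+(?P<name>[^\s\[\]]+)"
    r"(?:\s+\[(?P<opts>[^\]]*)\])?"      # optional [convert options]
    r"(?:\s+Q(?P<quality>\d+))?"         # optional quality, e.g. Q60
    r"(?:\s+!(?P<res>\d+))?\s*$"         # optional Photo CD resolution, e.g. !4
)


def rule_to_command(line, pcd_dir="<path_to_cd_imgs>", default_quality=75):
    """Return the convert argument list for one rules line, or None.

    Hypothetical helper: lines starting with '#' and empty lines give None.
    """
    if line.startswith("#") or not line.strip():
        return None
    match = RULE_RE.match(line)
    if match is None:
        raise ValueError("unrecognized rule: %r" % line)
    options = shlex.split(match.group("opts") or "")
    quality = match.group("quality") or str(default_quality)
    # The image number is padded to four digits as in img0001.pcd.
    source = "%s/img%04d.pcd" % (pcd_dir, int(match.group("num")))
    if match.group("res"):
        source += "[%s]" % match.group("res")
    target = "%s_%s%s.jpg" % (
        match.group("key"), match.group("num"), match.group("name"))
    return ["convert"] + options + ["-quality", quality, source, target]
```

Calling rule_to_command() with the last rule above yields the argument
list of the last convert call above; a line commented out with `#`
yields None.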

Each line of a rules file describes what to do with a certain image on
one of the Photo-CDs.  The user has to group the images into
subdirectories and create a rules file in each subdirectory.  After
changing one of the rules files type `make` in this directory and the
images with changed rules will be converted again.  (To detect which
rules were changed and which were not, pcd2html adds an internal
comment to the resulting HTML file, which is described later.)

Hint: To get the crop coordinates of an image the program paul
      was very useful for me.  You can obtain paul from
      
          http://packages.debian.org/paul

If you have published your images on commons.wikipedia.org you might
like to add a link to the corresponding WikiPedia page.  Just include
the name of the image at WikiPedia this way:

         {ThisImageAtWikiPedia.jpg}


Creating HTML files
-------------------
      
Suppose all images are converted as you like them.  Normally you now
would start hacking HTML files to present them in a reasonable manner.

Would it surprise you to hear that these files already exist?

OK, let me explain how it works.

After converting the image an HTML file is created, which simply loads
the image.  A style sheet called pcd.css is supported, which can
be helpful to change the look of all pages in one step.

But that's not all!

I mentioned above that you could use the time while the batch conversion
runs for editing the HTML file.  Now I have to say: Never edit the
HTML file by hand!  All your changes will be lost the next time you run
pcd2html.  But there is a cleverer way.

The user can supply three or four files for each
   image: <key_nnn>.eng, <key_nnn>.deu, <key_nnn>.tec and <key_nnn>.drf
where <key_nnn> stands for the key and number of the image as described above.

What are these beasts?
In <key_nnn>.eng you can write any HTML text in English.
The first line has a special meaning.  It is used as the title of
the page and is NOT displayed on the page itself.  Maybe it
would make sense to use it as a headline as well.  E-mail me your
opinion about it and I will change it!
The rest is printed below the image as an English description.

I think you can guess the meaning of <key_nnn>.deu.  This is for
speakers of German.  I think it would not be a problem for non-German
speakers to change pcd2html to support their favorite language in
the same way.  The important point is that pcd2html supports two
languages ... if you like.

What about the <key_nnn>.tec files?
These are for information which should be printed under both the
English and the German text.  This is interesting for photographers
who want to supply information like the film used, shutter speed,
aperture and so on.
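
How the text files fit together can be sketched as follows.  This is a
minimal illustration under assumptions, not pcd2html's real code: the
helper name compose_page_text is made up, and the sketch only splits
off the title line and appends the .tec text; the actual HTML layout
is not shown.

```python
def compose_page_text(lang_text, tec_text=""):
    """Split a <key_nnn>.eng or <key_nnn>.deu text into title and body.

    The first line becomes the page title; the remaining lines, followed
    by the optional <key_nnn>.tec text, form the description that is
    printed below the image.  (Hypothetical helper for illustration.)
    """
    title, _, body = lang_text.partition("\n")
    parts = [part.strip() for part in (body, tec_text) if part.strip()]
    return title.strip(), "\n".join(parts)
```

The same .tec text would be passed once with the English and once
with the German text, so it shows up on both pages.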

The <key_nnn>.drf files have a very special meaning.
I used to get some of my shots criticized in the newsgroup
de.rec.fotografie.  For a better overview, and to learn how to make
better photos, I like to see the comments of the other
photographers next to my photos.  So an optionally supplied
<key_nnn>.drf file causes the insertion of a link in the German page
to an HTML page which contains the comments on the current image.
This only makes sense in the German page because all the text is
in German.

Note: The input text of these files can contain 8-bit ISO characters.
      They will be converted to HTML syntax.
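
The conversion of 8-bit characters can be pictured like this.  It is
a sketch only: it assumes the input is ISO 8859-1 and that named HTML
entities are wanted, which may differ from what pcd2html really
writes, and the function name latin1_to_html is made up.

```python
from html.entities import codepoint2name


def latin1_to_html(raw):
    """Replace 8-bit ISO 8859-1 characters in raw bytes by HTML entities.

    ASCII characters are left untouched, because the input files may
    already contain HTML markup.
    """
    pieces = []
    for char in raw.decode("iso-8859-1"):
        code = ord(char)
        if code < 128:
            pieces.append(char)
        elif code in codepoint2name:
            pieces.append("&%s;" % codepoint2name[code])
        else:
            # No named entity known: fall back to a numeric reference.
            pieces.append("&#%d;" % code)
    return "".join(pieces)
```

For example the German word for "greetings", typed with u-umlaut and
sharp s, comes out as "Gr&uuml;&szlig;e".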

All these things are optional but quite useful.
The point is that you get your HTML pages fast and can put them on the
net immediately after getting the Photo CD.  If you find the time to
write some comments on your images, simply edit the <key_nnn>.eng,
<key_nnn>.deu or <key_nnn>.tec file and call pcd2html, and your HTML
pages are up to date again.

How are the HTML pages ordered?
The pages in each subdirectory appear in the same order as the images
are listed in the rules file.  Each page has a link to its previous
and its following page if such a page exists.  So your slide show is
ready to present.  Each subdirectory has an index file with a small
thumbnail.
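
The previous/next linking can be sketched as follows.  The function
name and the page names in the usage note are hypothetical; the only
thing taken from this README is that the order follows the rules file
and that a link is omitted where no neighbour exists.

```python
def neighbours(pages):
    """Map each page to its (previous, next) page in rules-file order.

    None marks a neighbour that does not exist, i.e. the first page has
    no previous page and the last page has no following page.
    """
    links = {}
    for index, page in enumerate(pages):
        previous_page = pages[index - 1] if index > 0 else None
        next_page = pages[index + 1] if index + 1 < len(pages) else None
        links[page] = (previous_page, next_page)
    return links
```

For three pages a.html, b.html and c.html the middle page gets links
to both neighbours, while a.html only gets a "next" link and c.html
only a "previous" link.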

What to do if there is some stuff which doesn't fit into that scheme?
You can supply an executable (shell script) called "extra" in each
directory and insert a line in the rules file

   extra  <image_file_got_by_extra>
   
What about the main index file?
In the main directory a rules file with different syntax is necessary.

It's recommended to use the first line as comment (starting with
`#`) and the following information:

   # <Photo-CD Name> <Version-Number>

These values are used if you create tar.gz archives of your pcd2html
data or of the finished HTML output.  This works in the following way:

   pcd2html data
   
   => stores all your user-supplied data in
      <Photo-CD Name>_data-<Version-Number>.tgz
      
   or

   pcd2html html
   
   => stores all necessary pcd2html output in
      <Photo-CD Name>_html-<Version-Number>.tgz
      
The first archive is what you have to store safely in order to rebuild
all necessary files.  The second one is all you need to put on the Web,
give to your friends or whatever else you want to do.
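
How the archive names follow from the first line of the main rules
file can be sketched like this.  The sketch assumes that the version
number is the last whitespace-separated word of the comment line; the
function name archive_names is made up for illustration.

```python
def archive_names(header_line):
    """Derive both archive names from '# <Photo-CD Name> <Version-Number>'.

    Returns a tuple (data archive, html archive).
    """
    # Drop the leading '#', then split off the last word as the version.
    name, version = header_line.lstrip("#").strip().rsplit(None, 1)
    return (
        "%s_data-%s.tgz" % (name, version),
        "%s_html-%s.tgz" % (name, version),
    )
```

With a first line of "# Holiday 1.0" this gives Holiday_data-1.0.tgz
and Holiday_html-1.0.tgz.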

After this optional header line follow the names of the subdirectories
with the titles of their index files in this syntax:

     subdir  {English title of subindex}  [German title of subindex]
     
The main rules file controls which subdirectories are visited to
search for the image rules files.  The main index file has links to
the indexes in the subdirectories.  These links are sorted into a
table and marked with thumbnails.  Which image is used for the
thumbnail can be declared in the rules file of the subdirectory:  the
image description line which is marked with an asterisk `*` defines
the image to be used here.  If there is no marked line in a
rules file, the first image is used.
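
Reading the subdirectory lines of the main rules file can be sketched
as follows.  The regular expression is an assumption based on the one
syntax line given above, and the function name parse_main_rules is
hypothetical; comment lines and empty lines are simply skipped.

```python
import re

# subdir  {English title of subindex}  [German title of subindex]
SUBDIR_RE = re.compile(
    r"^(?P<subdir>\S+)\s+\{(?P<eng>[^}]*)\}\s+\[(?P<deu>[^\]]*)\]$"
)


def parse_main_rules(text):
    """Return a list of (subdir, English title, German title) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # empty line or comment such as the header line
        match = SUBDIR_RE.match(line)
        if match is None:
            raise ValueError("unrecognized line: %r" % line)
        entries.append(
            (match.group("subdir"), match.group("eng"), match.group("deu")))
    return entries
```

The order of the returned list is the order of the lines in the file,
which is what a generated index would need to keep.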

At the bottom, the main index file has an additional link to the
other language.  I considered it useless to have a link to the
other language on each page, but it wouldn't be the slightest problem
to implement such a feature.

All HTML pages contain meta tags with the date of creation, the
creating software and the name of the user who created the file (if
available in the `finger` information).  In addition, a `keywords`
meta tag can be attached to the main index file.  It is included if
a file keywords.eng or keywords.deu exists.  Likewise the files
contents.eng and contents.deu are responsible for the `contents` meta
tag.  Both tags are interesting for Web search engines.
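
Turning the content of such a file into a meta tag could look like
the sketch below.  The exact markup pcd2html writes is not documented
here, so the tag layout, the whitespace handling and the function
name meta_tag are all assumptions.

```python
from html import escape


def meta_tag(name, file_text):
    """Build a meta tag from the text of e.g. keywords.eng.

    Line breaks and repeated blanks in the file are collapsed to single
    spaces, and characters that are special in HTML attributes are escaped.
    """
    content = " ".join(file_text.split())
    return '<meta name="%s" content="%s">' % (name, escape(content, quote=True))
```

A keywords.eng file with the two lines "Photo CD," and "holiday,
mountains" would give a single `keywords` tag with the content
"Photo CD, holiday, mountains".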

Last but not least you should not forget to write the index files
index.eng and index.deu.  These files contain the text of the main page.
As usual the first line of each file is the title of the page,
and here it is also used as headline.

Two comments have special meanings:
   # back: <address>
where <address> is the URL of the page one step deeper in your
page hierarchy and
   # home: <home page>
marks your home page.  It is good style to give visitors a chance
to go back to a reasonable place, so you shouldn't forget to put these
comments into your index.eng and index.deu files.
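
Picking the two special comments out of an index file can be sketched
like this.  The pattern is an assumption based on the two lines shown
above, the function name navigation_links is made up, and the URLs in
the usage note are placeholders.

```python
import re

# Matches '# back: <address>' and '# home: <home page>'.
NAV_RE = re.compile(r"^#\s*(back|home):\s*(\S+)$")


def navigation_links(index_text):
    """Return a dict with the 'back' and 'home' URLs found in the text."""
    links = {}
    for line in index_text.splitlines():
        match = NAV_RE.match(line.strip())
        if match:
            links[match.group(1)] = match.group(2)
    return links
```

For an index.eng containing "# back: http://example.org/photos/" and
"# home: http://example.org/" the result holds both URLs under the
keys "back" and "home"; a missing comment simply leaves its key out.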
     

Requirements
------------

Operating system:
 - pcd2html was developed under Debian GNU/Linux.  There is a
   Debian package of it.
 - It should work under any modern Linux system.
 - It should work under any modern UNIX system.  Problems could
   appear when trying to mount the CD-ROM drive automatically, but
   this is no real problem.
 - In principle pcd2html might work under any operating system which
   runs the tools listed below.  You have to patch PATH names.
   I accept and include patches for other systems but without any
   warranty.

ImageMagick:
 - Which version of convert you need depends on what you want to
   do with the image.  I used 
     ImageMagick 4.0.4 98/04/01 cristy@sympatico.org

pcd2html uses the following UNIX tools:
 - bash (GNU bash, version 2.01.1(1)-release)
   Some special bash features were used, so I'm not sure whether other
   shells can be used.  Older bash versions should work, too.
 - GNU make (GNU Make version 3.76.1)
   There are features in GNU make which aren't available in other
   make varieties, so I don't expect any other make to work with
   pcd2html.  Older versions could work, too.
   Just try your favorite make and see if it works.
 - grep (grep (GNU grep) 2.1)
   No very specific features of GNU grep are used.  Every grep
   should work.
 - sed (GNU sed version 2.05)
   The sed programming in pcd2html is not very tricky, so it is
   expected to work with other sed versions, too.
 - perl (version 5.004)
   Since pcd2html-0.2 much of the code has been rewritten in perl.
   This makes the beast more portable and might replace the last
   grep/sed lines in the future.  Older perl versions could work.

I really hope that pcd2html is useful for you and look forward to any
criticism, bug fixes or enhancements.

Andreas Tille <tille@debian.org>