
Web Site Content Management - Publish to PDF

It is quite complex to go directly from an HTML website to PDF, so I am proposing to do this in stages: first website HTML to XHTML, then XHTML to XSL-FO, and finally XSL-FO to PDF.

I also considered loading the website into Word or OpenOffice and then converting to XSL-FO, although this would require more manual intervention. The native OpenOffice format is an XML file, which could be converted to XSL-FO using an XSLT script.

Website HTML to XHTML

This feature requires Kontent to read each HTML file being managed (every HTML file under the current directory) and to convert its content as described here:

http://www22.brinkster.com/beeandnee/techzone/articles/htmltoxhtml.asp

It then writes the file to disc as XHTML.

The program already has code which reads HTML (using SAX) to get the title for the index generator. So this would require a new tab, similar to the others, which works as described here.
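As a rough illustration of the kind of SAX reading mentioned above, here is a minimal sketch that pulls the title out of a page. It is not Kontent's actual code: the class name TitleReader and the method readTitle are hypothetical, and it assumes the markup is already well-formed enough for an XML parser to accept.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Kontent: reads the <title> text with SAX.
final class TitleReader {

    static String readTitle(String markup) throws Exception {
        StringBuilder title = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            private boolean inTitle;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equalsIgnoreCase("title")) {
                    inTitle = true;
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equalsIgnoreCase("title")) {
                    inTitle = false;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so accumulate it.
                if (inTitle) {
                    title.append(ch, start, length);
                }
            }
        };

        // The default factory is not namespace aware, so qName is the plain tag name.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(markup)), handler);
        return title.toString().trim();
    }
}
```

The same handler pattern could be extended to rewrite elements as they are read, which is one reason reusing the SAX code for the XHTML conversion looks attractive.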

If it is too hard to handle CSS that could be left for a future stage.

If a <p> is found, the program needs to check for a closing </p>; if none is found, one needs to be inserted before the next formatting tag (for example the next <p> or a heading).
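The paragraph rule above could be sketched as follows. This is only one possible approach, using a regular expression over tags rather than the SAX reader; the class name ParagraphCloser and the list of tags that end a paragraph are assumptions made for the example.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: inserts a missing </p> before the next block-level tag.
final class ParagraphCloser {

    // Tags assumed to end an open paragraph. Illustrative, not a complete list.
    private static final Set<String> BLOCK_TAGS = Set.of(
            "p", "h1", "h2", "h3", "h4", "h5", "h6",
            "div", "ul", "ol", "table", "pre", "blockquote", "hr");

    // Captures the optional slash and the tag name of any opening or closing tag.
    private static final Pattern TAG = Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>");

    static String closeParagraphs(String html) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(html);
        boolean paragraphOpen = false;
        int last = 0;

        while (m.find()) {
            out.append(html, last, m.start());   // copy the text before this tag
            boolean closing = !m.group(1).isEmpty();
            String name = m.group(2).toLowerCase();

            if (name.equals("p") && closing) {
                paragraphOpen = false;           // paragraph was closed properly
            } else if (BLOCK_TAGS.contains(name)) {
                if (paragraphOpen) {
                    out.append("</p>");          // close the dangling paragraph first
                    paragraphOpen = false;
                }
                if (name.equals("p")) {
                    paragraphOpen = true;        // a new paragraph starts here
                }
            }
            out.append(m.group());
            last = m.end();
        }

        out.append(html.substring(last));
        if (paragraphOpen) {
            out.append("</p>");                  // paragraph still open at end of input
        }
        return out.toString();
    }
}
```

For example, closeParagraphs("<p>one<p>two</p>") gives "<p>one</p><p>two</p>". A regular expression will not cope with every page (tags inside comments or scripts, for instance), so this should be read as a starting point only.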

The new files should be put in a new directory tree with the same structure as the original files. All .html (and .htm) files are converted to XHTML; other files, such as .gif and .jpeg, are copied into the new directory tree unchanged.

node.convertToXhtml
nodeDir.convertToXhtml
nodeHTML.convertToXhtml
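The directory handling described above might look something like this. The method names listed suggest that each node type does its own conversion; this sketch instead uses a single walk with java.nio.file to keep it short. The class name TreeConverter, the .xhtml extension for the output files and the toXhtml function passed in are all assumptions for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;
import java.util.function.UnaryOperator;
import java.util.stream.Stream;

// Hypothetical helper: mirrors a directory tree, converting HTML and copying the rest.
final class TreeConverter {

    // targetRoot must not be inside sourceRoot, or the walk would visit its own output.
    static void convertTree(Path sourceRoot, Path targetRoot, UnaryOperator<String> toXhtml)
            throws IOException {
        try (Stream<Path> paths = Files.walk(sourceRoot)) {
            // Files.walk visits a directory before its contents,
            // so each target directory exists before files are written into it.
            for (Path source : (Iterable<Path>) paths::iterator) {
                Path target = targetRoot.resolve(sourceRoot.relativize(source));
                if (Files.isDirectory(source)) {
                    Files.createDirectories(target);
                } else if (isHtml(source)) {
                    String html = Files.readString(source, StandardCharsets.UTF_8);
                    Files.writeString(withExtension(target, "xhtml"),
                            toXhtml.apply(html), StandardCharsets.UTF_8);
                } else {
                    Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    private static boolean isHtml(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".html") || name.endsWith(".htm");
    }

    private static Path withExtension(Path file, String extension) {
        String name = file.getFileName().toString();
        String base = name.substring(0, name.lastIndexOf('.'));
        return file.resolveSibling(base + "." + extension);
    }
}
```

The sketch reads every page as UTF-8 and does not rewrite links between pages, so links that point at .html files would still need attention if the output extension changes.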

XHTML to XSL-FO

Here is an XSLT script to do the conversion: http://www.antennahouse.com/XSLsample/XSLsample.htm

This article explains the issues: http://www-106.ibm.com/developerworks/library/x-xslfo2app/
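Applying an XSLT script from Java can be done with the standard javax.xml.transform API. The following is a minimal sketch, assuming the stylesheet and the XHTML page are available as readers; the class name XhtmlToFo is hypothetical and nothing here is specific to the stylesheet linked above.

```java
import java.io.Reader;
import java.io.Writer;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: runs an XSLT stylesheet over an XHTML document.
final class XhtmlToFo {

    static void transform(Reader xhtml, Reader stylesheet, Writer fo)
            throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Compile the stylesheet, then apply it to the XHTML input.
        Transformer transformer = factory.newTransformer(new StreamSource(stylesheet));
        transformer.transform(new StreamSource(xhtml), new StreamResult(fo));
    }
}
```

In practice the readers would be opened on the converted .xhtml file and on whichever stylesheet file is chosen, and the writer on the .fo output file. Because XHTML elements live in the http://www.w3.org/1999/xhtml namespace, the stylesheet's match patterns need to use that namespace. A page with a DOCTYPE may also cause the parser to try to fetch the DTD.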

XSL-FO to PDF

See Apache FOP: http://xml.apache.org/fop/

 



Program

I am working on a project which uses these principles. If you would like to help me with this, you are welcome to join in here:

for kontent: http://sourceforge.net/projects/kontent/

This site may have errors. Don't use for critical systems.

Copyright (c) 1998-2008 Martin John Baker - All rights reserved - privacy policy.