Table of Contents
Earlier, I mentioned the possibility of processing a document more than once to format the contents in different ways. That's what I'm going to do this time. I kept this example intentionally ultra-simple because I want to concentrate on the mechanics of modes rather than confuse the issue by making hyperlinks from the table of contents to the entries. I've also chosen to add information in the table of contents that isn't explicitly present in the XML document itself. The XML input document is shown in Listing 19.14 and the DSSSL style sheet is shown in Listing 19.15.
Listing 19.14 XML Cookbook File 4 (Table of Contents)
<?xml version="1.0"?>
<!DOCTYPE DOC [
<!ELEMENT DOC (TITLE | SECTION)*>
<!ELEMENT SECTION (TITLE | CAUTION | NOTE | PARA)*>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT NOTE (#PCDATA)>
<!ELEMENT PARA (#PCDATA | TITLE)*>
<!ELEMENT CAUTION (#PCDATA)>
]>
<DOC>
<TITLE>Simple XML to HTML Conversion</TITLE>
<SECTION>
<TITLE>Introduction</TITLE>
<PARA>This sample document demonstrates how you can
create a table of contents. This code uses two DSSSL modes. The
most important (toc) model allows us to process the document twice,
once to extract the text we need for the TOC and once for
the normal document formatting.
element.</PARA>
<NOTE>This is an extremely powerful feature of DSSSL that is well
worth learning.</NOTE>
<PARA>This second paragraph proves that normal text is still
untouched.</PARA>
</SECTION>
<SECTION><TITLE>Going Further</TITLE>
<CAUTION>Don't forget to use the debug features when developing
DSSSL style sheets with jade.</CAUTION>
<PARA>If you look at the code, you'll see I used two modes, not one,
I use a toc mode on the document to extract the TITLE elements to
include them in the TOC, and another mode to extract the text
contained within the TITLE element since I don't want
them to be formatted.</PARA>
</SECTION>
<SECTION><TITLE>And then?</TITLE>
<PARA>Even when XSL takes off, DSSSL is going to
be around for a long time yet (if for no other reason than
that it is an ISO standard). DSSSL (with jade) is still worth
learning.</PARA>
</SECTION>
</DOC>
Listing 19.15 DSSSL Cookbook File 4 (Table of Contents)
<!DOCTYPE style-sheet
  PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN">
(declare-flow-object-class element
  "UNREGISTERED::James Clark//Flow Object Class::element")
(declare-flow-object-class document-type
  "UNREGISTERED::James Clark//Flow Object Class::document-type")
(declare-flow-object-class empty-element
  "UNREGISTERED::James Clark//Flow Object Class::empty-element")
(declare-flow-object-class formatting-instruction
  "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction")
(define debug
  (external-procedure "UNREGISTERED::James Clark//Procedure::debug"))

<![CDATA[
(define RED-ON (make formatting-instruction
  data: "<FONT COLOR='RED'>"))
(define RED-OFF (make formatting-instruction data: "</FONT>"))
(define RULE (make formatting-instruction data: "<HR>"))
(define START (make formatting-instruction data: "<P>"))
(define STOP (make formatting-instruction data: "</P>"))
]]>

(define (make-special-para)
  (make sequence
    (make element
      gi: "P"
      (make element
        gi: "B"
        (literal (string-append (gi) ":"))))
    (make element
      gi: "BLOCKQUOTE"
      (process-children))))

(element DOC
  (sosofo-append
    (make document-type name: "HTML"
      public-id: "-//W3C//DTD HTML 3.2//EN")
    (make element gi: "HTML"
      (sosofo-append
        (make element gi: "HEAD"
          (make element gi: "TITLE" (sosofo-append
            (literal "Simple XML-to-HTML Conversion")))
        ))
      (make element gi: "BODY"
        (make sequence
          (make element gi: "H2" (sosofo-append
            (literal "Table of Contents")))
          (with-mode toc (process-matching-children 'section))
          (process-children))))))

(mode extract-title-text (element (TITLE) (process-children)))

(mode toc
  (element section
    (make sequence
      START
      (literal "Section ")
      (literal (format-number (child-number) "1"))
      (literal " ... ... ... ... ... ")
      (with-mode extract-title-text
        (process-first-descendant "TITLE"))
      STOP)))

(element (DOC TITLE)
  (make sequence RULE (make element gi: "H2" )))

(element (SECTION TITLE) (make element gi: "H3" ))

(element NOTE (make-special-para))

(element CAUTION
  (make sequence
    RED-ON (make-special-para) RED-OFF))

(element PARA (make element gi: "P"))
The secret to making all this work is to use a mode. A mode is basically a different way of processing the document. To make the table of contents I need two modes.
A mode is a named set of processing instructions. By setting conditions for when a mode is triggered, you can implement conditional processing. By including several modes you can process a document several times, but in different ways. For example, you would use one mode to process the complete contents of a document to format it, but another mode to extract and format only selected elements to generate a table of contents or an index.
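To make the general shape concrete before looking at the two modes used here, the following is a minimal sketch rather than part of Listing 19.15. The mode name caution-list and the choice of an LI element are assumptions made purely for illustration; the element flow object is the one declared at the top of the style sheet.

```scheme
;; Default (unnamed) mode: CAUTION is formatted in place
;; during the normal pass over the document.
(element CAUTION (make element gi: "P"))

;; Hypothetical named mode: the same element type, formatted
;; differently when it is visited again in a second pass.
(mode caution-list
  (element CAUTION
    (make element gi: "LI"
      (process-children))))

;; The named mode is only used where a rule asks for it, for
;; example from inside a rule for SECTION:
;;   (with-mode caution-list
;;     (process-matching-children "CAUTION"))
```

The point of the sketch is that the rules inside a mode sit alongside the ordinary rules, and nothing happens in the named mode until a with-mode expression selects it.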
The sole purpose of the first mode is to extract the children of the title elements. Because these elements don't have any child elements, the process-children instruction simply processes the text that they contain:
(mode extract-title-text
  (element (TITLE)
    (process-children)))
Now I need a second mode that specifies how to handle this extracted text. This is the mode that actually creates the table of contents:
(mode toc
  (element section
    (make sequence
      START
      (literal "Section ")
      (literal (format-number (child-number) "1"))
      (literal " ... ... ... ... ... ")
      (with-mode extract-title-text
        (process-first-descendant "TITLE"))
      STOP)))
The formatting specification is triggered by a SECTION element. I then create a sequence consisting of an opening P tag (the START definition), some literal text, the section number, a row of dots, the text from the title, and a closing P tag (the STOP definition). There are two special things about the contents of this specification:
- The number: Even though I haven't numbered the sections (I could use the style sheet to do this too), each element still has its place in the tree: the child-number, which is its position among the sibling elements of the same type. I wrap the child-number in a format-number call that specifies how I want the number displayed. The format string could be "1" (decimal numbers: 1, 2, 3), "A" (uppercase alphabet: A, B, C), "a" (lowercase alphabet: a, b, c), "I" (uppercase roman: I, II, III, IV), or "i" (lowercase roman: i, ii, iii, iv).
- Using the mode: I tell the DSSSL engine to use the mode with the with-mode keyword, and then I tell it to process only the first TITLE element it finds inside the section. There won't actually be any other child TITLE elements, but I don't want to include all the paragraph text (which is what I'd get if I said process-children).
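As a variation on the two points above, here is a hedged sketch of the same table-of-contents rule using uppercase roman numerals. The mode name toc-roman is an assumption of mine and does not appear in Listing 19.15. The sketch also swaps the START and STOP formatting instructions for a real P element, which is a design choice rather than a requirement: make element writes the matching end tag itself, so the two halves cannot get out of step.

```scheme
;; Hypothetical variant of the toc mode (not part of Listing 19.15).
;; It reuses the extract-title-text mode defined in the listing.
(mode toc-roman
  (element section
    (make element gi: "P"
      ;; "I" asks format-number for uppercase roman numerals.
      (literal (format-number (child-number) "I"))
      (literal ". ")
      ;; Pull in only the title text, unformatted.
      (with-mode extract-title-text
        (process-first-descendant "TITLE")))))
```

To try it, the with-mode call in the DOC rule would name toc-roman instead of toc, as in (with-mode toc-roman (process-matching-children 'section)). Under the format strings described above, the entries would then be numbered I, II, III rather than 1, 2, 3.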
This is an XML-to-HTML conversion, so we can look at the output code again, shown in Listing 19.16. The Web browser display is shown in Figure 19.10.