



    the test-suite for
    project gutenberg

    a document containing the
    full range of features found
    in project gutenberg e-texts

    by bowerbird intelligentleman

     greetings, earthling...

     this is an e-text brought to you by project gutenberg,
    a 35-year-old volunteer effort to put literature online.

     please see the web-site for news and information on
    usage conditions for e-texts, volunteering, and more...

     http://www.gutenberg.org
    


    <--c-->

    table of contents

     the test-suite for project gutenberg
    table of contents
    dedication
    chapter 1 -- welcome aboard
    chapter 2 -- the sections of the book
    chapter 3 -- text "styling"
    chapter 4 -- plain ascii versus unicode
    chapter 5 -- poetry and other silly things
    chapter 6 -- tables
    chapter 7 -- centered text
    chapter 8 -- pictures in your book
    chapter 9 -- footnotes and endnotes
    chapter 10 -- hotlinks
    chapter 11 -- hyphens and dashes
    chapter 12 -- hyphenation stinks
    chapter 13 -- unlucky 13
    chapter 14 -- two spaces after a sentence
    chapter 15 -- multi-purpose block-quotes
    chapter 16 -- the play is the thing
    chapter 17 -- epigraphs and epitaphs
    chapter 18 -- lists in your book
    a subsection of chapter 18
    another subsection of chapter 18
    chapter 19 -- the meta-data chapter
    chapter 20 -- zen markup language
    chapter 21 -- the end
    the notes section
    meta-data


    <--c-->

    dedication

     to michael hart

     for his vision
    and his persistence...

     left-justified

     centered
     right-justified

     this just has one tab...

    did they all work right?


    <--c-->

    chapter 1

    welcome aboard

    welcome!

    this is a document intended to demonstrate the range of features common to the e-texts in the project gutenberg library, and indeed to the majority of printed books.

    project gutenberg is a volunteer effort -- please see http://www.gutenberg.org -- digitizing the text of public-domain books, for viewing and distribution in cyberspace.

    it was begun by michael hart in 1971, with the goal of creating 10,000 e-texts, a milestone that was achieved in 2003, with boosts from distributed proofreaders -- at http://www.pgdp.net -- an advance that allows people to proofread online, thousands of 'em doing a page at a time, using bits and pieces of their spare time.

    if you want to support the p.g. library with some kind of software or markup system, you should be able to handle its features, and you can use this file as a "test-suite" to verify that your system is fully capable.

    this document should be self-explanatory. tabs have been substituted with "", so that they will become visible to you, so they can be changed back for your testing. other than that, no changes should be needed.

    if you find any inconsistencies in this test-suite, do please let us know immediately. thank you.


    <--c-->

    chapter 2

    the sections of the book

    first, you should be able to handle headings of different levels, such as the book, chapters, and subsections.

    you may label the levels as you like.

    html can support 6 different levels, so that's a good number to shoot for.

    one of the things that users find handy is a table of contents for the e-book, so you must be capable of generating one, in cases where an e-text doesn't have one.

    because of their experience with the web, people often expect that this table of contents will be hotlinked to the appropriate sections, so your markup system should facilitate that. a nice touch is to have the chapter headings link back to the table of contents...
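
    here is a minimal sketch, in python, of how a system might generate a hotlinked table of contents and give each heading a link back to it. the function names, the "#toc" anchor, and the way anchor names are built from heading text are all assumptions made for this illustration, not part of any existing project gutenberg tool.

```python
import html
import re


def slugify(title: str) -> str:
    """Turn a heading into an anchor name, e.g. 'chapter 1' -> 'chapter-1'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_toc(headings: list[str]) -> list[str]:
    """Return one html link per heading, pointing at that heading's anchor."""
    return [
        f'<a href="#{slugify(h)}">{html.escape(h)}</a>'
        for h in headings
    ]


def link_heading(title: str, level: int = 2) -> str:
    """Render a heading that carries its own anchor and links back to '#toc'.

    '#toc' is a hypothetical anchor name for the table of contents.
    """
    return (
        f'<h{level} id="{slugify(title)}">'
        f'<a href="#toc">{html.escape(title)}</a>'
        f"</h{level}>"
    )
```

    since html offers six heading levels, the "level" argument would run from 1 to 6. two headings with identical text would collide on the same anchor name, so a real system would need to number duplicates.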


    <--c-->

    chapter 3

    text "styling"

    project gutenberg was born a very long time ago, before word-processors and personal computers.

    a rumor is that michael used a keypunch machine (it's ok if you're too young to know what that is) to enter a good number of the original e-texts...

    computers didn't even have lower-case characters in the early days, so the whole book was capitalized! luckily, before long we got lower-case characters.

    but still, some "luxuries" like italicized and bold text were not possible, so michael developed a convention where a word that was bold or italic in the original was entered in all-upper-case, to show the emphasis.

    because the e-texts are stored as raw ascii text, that convention lives on, to this day, in some files. by this time, however, we need to be able to handle styled text, so your systems must be able to do so.
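
    as a rough illustration of handling the all-upper-case convention, a viewer might convert capitalized words back into styled text. the sketch below is only one possible approach, and the choice of the html "em" tag is an assumption. it cannot tell emphasis apart from a genuine acronym, so a real system would need an exception list.

```python
import re

# two or more capital letters in a row, as a whole word,
# so single-letter words like "I" and "A" are left alone
CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")


def caps_to_emphasis(text: str) -> str:
    """Wrap each all-caps word in <em> tags and lower-case it."""
    return CAPS_WORD.sub(
        lambda m: f"<em>{m.group(0).lower()}</em>", text
    )
```

    for example, caps_to_emphasis("I was VERY tired") gives "I was <em>very</em> tired".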


    <--c-->

    chapter 4

    plain ascii versus unicode

    most english e-texts in the library can be represented in the lower-ascii characters, but future e-texts are likely to require some unicode characters, so you should definitely be able to handle text-encoding.
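
    one hedged way to cope with text-encoding is to try the strictest decoding first and fall back from there. the function name and the fallback order below are assumptions for illustration. the e-texts themselves do not prescribe any particular strategy.

```python
def decode_etext(raw: bytes) -> tuple[str, str]:
    """Decode raw bytes and report which encoding worked.

    Tries strict ascii, then utf-8, then latin-1 as a last resort.
    """
    for encoding in ("ascii", "utf-8"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this never fails,
    # but it may guess wrong for files in some other 8-bit encoding
    return raw.decode("latin-1"), "latin-1"
```

    a file that decodes as plain ascii needs no special handling at all, which covers most of the english e-texts.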


    <--c-->

    chapter 5

    poetry and other silly things

    many of the e-texts contain poetry, or verse of some type, so your system must be able to handle silliness like that.

    some poems want to be left-justified, so you should be able to handle that:

       a haiku for you
    (by bowerbird intelligentleman)

       haiku have three lines
    and seventeen syllables,
    five, seven, and five.

    other poems want to be centered instead:

         t.v. will eat you    
    (by bowerbird intelligentleman)

    t.v. will eat you
    out of a satellite dish
    with a tuning fork

    in general, lines of a poem would prefer to stay together, that is, to be kept all on a single page whenever possible, so your system should attempt to accomplish that feat.

    if it's not possible to keep the whole poem on one page, try to make the page-break occur between the verses...
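
    the keep-together rule can be sketched as a small pagination routine. this is only an illustration under simple assumptions: a page is measured in lines, a poem is a list of verses, and "paginate_poem" is a made-up name. a verse longer than a whole page is simply left to overflow.

```python
def paginate_poem(stanzas: list[list[str]], page_height: int) -> list[list[str]]:
    """Split a poem into pages, breaking only between verses.

    A blank line separates verses that share a page.
    """
    pages: list[list[str]] = []
    current: list[str] = []
    for stanza in stanzas:
        # one extra line for the blank separator if the page is not empty
        needed = len(stanza) + (1 if current else 0)
        if current and len(current) + needed > page_height:
            pages.append(current)
            current = []
        if current:
            current.append("")
        current.extend(stanza)
    if current:
        pages.append(current)
    return pages
```

    when the whole poem fits within page_height, the result is a single page, which is the preferred outcome.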


    <--c-->

    chapter 6

    tables

    there aren't a whole lot of tables in the e-texts -- we're talking literature, not spreadsheets -- but your system should handle tables anyway; not really big and hairy ones, just simple ones.

                    column 1    column 2
    plain-text      yes         yes
    x.m.l.          no          yes
    h.t.m.l.        yes         no
    .rtf            no          yes
    .pdf            no          no


    <--c-->

    chapter 7

    centered text

         center me please!    

    sometimes, for one reason or another, some of an e-text's lines are centered. so your system should be able to do that.


    <--c-->

    chapter 8

    pictures in your book

    most of the p.g. e-texts are text-only. but some of them do have pictures, so your system must be able to show 'em.

    
http://snowy.arsc.alaska.edu/bowerbird/alice01/alice01/checking_watch.png

    put a picture here, or maybe a button that someone could click in order to view that picture...

    "what is the use of a book," thought alice, "without pictures or conversation?"

    
http://snowy.arsc.alaska.edu/bowerbird/alice01/alice01/alice_cramped.png


    <--c-->

    chapter 9

    footnotes and endnotes

    some of the e-texts have footnotes.[1]

    your system must be able to handle them. how it might do that is up to you, captain.
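
    one possible way to handle notes, sketched here with made-up function names, is to collect the numbered notes from the end of the file and then turn each bracketed number in the body into a link. the "note-N" anchor naming is an assumption for this example.

```python
import re

NOTE_START = re.compile(r"^\[(\d+)\]\s+(.*)$")
NOTE_REF = re.compile(r"\[(\d+)\]")


def collect_notes(note_lines: list[str]) -> dict[int, str]:
    """Map each note number to its text.

    Lines that do not start with [n] continue the previous note.
    """
    notes: dict[int, str] = {}
    current = None
    for line in note_lines:
        line = line.strip()
        if not line:
            continue
        match = NOTE_START.match(line)
        if match:
            current = int(match.group(1))
            notes[current] = match.group(2)
        elif current is not None:
            notes[current] += " " + line
    return notes


def link_refs(body: str, notes: dict[int, str]) -> str:
    """Turn each [n] that has a matching note into a link to that note."""
    def repl(m: re.Match) -> str:
        n = int(m.group(1))
        if n in notes:
            return f'<a href="#note-{n}">[{n}]</a>'
        return m.group(0)  # leave unknown references untouched
    return NOTE_REF.sub(repl, body)
```

    a viewer could just as well show the note text in place rather than linking to it, since the collected dictionary holds the full text of every note.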

    
http://snowy.arsc.alaska.edu/bowerbird/alice01/alice01/alice_holding.png


    <--c-->

    chapter 10

    hotlinks

    remember how, in chapter 2, we said that the table of contents should be hot-linked to the appropriate spots?

    that is one type of link you need. there are several other types as well.

    your system should be able to make jumps to an internet site. most of the e-texts are quite old, so of course it's not like they have a bunch of internet u.r.l.'s in them; but every e-text will indeed contain a link to project gutenberg's website, so you must be able to execute links...

    quite often there are places in an e-text that reference other parts of the e-text. in these cases, it's nice to have a hotlink close to (or on) that reference point that transports the reader directly to the place that is being referenced; it is convenient. your system should facilitate such linking, preferably making it happen automatically.

    for instance, the beginning of this chapter has a reference to chapter 2. if a reader were to click on the words, "chapter 2", they should automatically go to chapter 2. (and likewise with each of the references to "chapter 2" here in this paragraph too.)

    note this functionality is not yet present in this version of this program. sorry! :+)
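
    for anyone building such a system, automatic linking of chapter references might look roughly like the sketch below. it assumes each chapter heading carries an anchor named "chapter-N", which is a naming choice made up for this example.

```python
import re

# the word "chapter" followed by a number, in any capitalization
CHAPTER_REF = re.compile(r"\bchapter (\d+)\b", re.IGNORECASE)


def link_chapter_refs(text: str) -> str:
    """Wrap every 'chapter N' in a link to the assumed anchor '#chapter-N'."""
    return CHAPTER_REF.sub(
        lambda m: f'<a href="#chapter-{m.group(1)}">{m.group(0)}</a>',
        text,
    )
```

    this would also link a chapter heading to itself, so a real system would probably skip heading lines.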


    <--c-->

    chapter 11

    hyphens and dashes

    i use a hyphen between "e" and "text" in "e-text". not everyone does, but i think that it looks nicer.

    a hyphen -- as you know -- differs from a dash. and you probably know that there are even two (and some people say more!) types of dashes...

    the first - called an "en-dash" - is a narrow one. you will see these in a fair number of the e-texts. it's called an "en" dash because it was traditionally defined as being exactly as wide as the letter "n". (or, some say, as wide as a letter "n" is high.)

    the second -- called an "em-dash" -- is wider, and yes, it's called that because it's as wide as an "m", or so the story goes, according to some people...

    generally, try to use an em-dash, not an en-dash. the en-dash looks too much like a hyphen, especially when it is run into the words that are surrounding it.

    now, the convention says that you should not have spaces on the sides of a dash. the convention is wrong. it looks much nicer if you put spaces around a dash. perhaps even more importantly, the search capability of many programs is thrown off if you don't use spaces, and so are the re-margination routines in many programs. so -- to avoid these problems -- put spaces around dashes.

    a problem arises, though, because there is no em-dash in the lower-ascii codes. so you have to use a double-dash -- like these here -- for an em-dash. ok, problem solved.
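
    a viewer that does have unicode available could swap the double-dash for a real em-dash at display time. the following is a small sketch of that idea. it only touches a double-dash that has a space on each side, as recommended above, and the function name is made up.

```python
import re

# a double-dash standing alone, with a space on each side
SPACED_DOUBLE_DASH = re.compile(r"(?<= )--(?= )")


def to_em_dash(text: str) -> str:
    """Replace each spaced double-dash with the unicode em-dash (U+2014)."""
    return SPACED_DOUBLE_DASH.sub("\u2014", text)
```

    ordinary hyphens, as in "e-text", are left alone because they are single and have no spaces around them.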


    <--c-->

    chapter 12

    hyphenation stinks

    hyphenation is another thing that messes up those search capabilities. e-books don't need hyphenation. so turn hyphenation off when you make an e-book.


    <--c-->

    chapter 13

    unlucky 13

    there is no 13th floor in most buildings.


    <--c-->

    chapter 14

    two spaces after a sentence

    back in the old typewriter days, students were instructed to put two spaces after a sentence.

    ever since the computer, though, some people have said two spaces are no longer required, that it is an unnecessary leftover from earlier times.

    those people were wrong. you still need two spaces.

    nonetheless, some e-texts now only have one space after each sentence, so a good viewer-program should fix that, by displaying an extra space when it's absent.

    but of course, where there already are two spaces, it should not add an extra, as it would be too much.

    also, there are some periods -- like in "mr. hart" and other abbreviations -- that do not indicate the end of a sentence, so no extra space should be added there, and the program must be intelligent enough to know that.

    luckily, it's only the programmers of viewer-apps that need to worry about this. if you are making an e-book, just put a single space after all sentences.
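
    for those programmers, here is a hedged sketch of the rule: add a space after a sentence that has only one, leave two spaces alone, and skip known abbreviations. the abbreviation list is a tiny made-up sample, and a real viewer would need a much longer one.

```python
import re

# assumption: a tiny illustrative list, not a complete one
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "st.", "etc."}

# a word ending in sentence punctuation, exactly one space,
# and then a non-space character
SENTENCE_GAP = re.compile(r"(\S+[.!?]) (?=\S)")


def widen_sentence_gaps(text: str) -> str:
    """Show two spaces after each sentence that has only one."""
    def repl(m: re.Match) -> str:
        word = m.group(1)
        if word.lower() in ABBREVIATIONS:
            return m.group(0)  # not the end of a sentence
        return word + "  "
    return SENTENCE_GAP.sub(repl, text)
```

    where two spaces are already present the pattern does not match, so no third space is ever added.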


    <--c-->

    chapter 15

    multi-purpose block-quotes

    sometimes you wanna quote a whole bunch of stuff from someone. this is often called a "block-quote".

    many of the project gutenberg e-texts contain block-quotes of one type or another.

    here's an example of a block-quote, a letter.

     dear leslie,

    how are you? i am fine.
    the weather is nice here.
    but i wish it was half
    as beautiful as you are.

    and i wish you were here.

    love,
    bowerbird

    typically, block-quotes are indented on both the left and right sides.

    here's another block-quote, from a speech.

         four score and seven years ago, our
    fathers brought forth on this continent
    a new nation, conceived in liberty and
    dedicated to the proposition that
    all men are created equal.[2]

    there are a number of different situations throughout the e-texts that might call for this type of indentation. for now, we will just subsume them all under "block-quote"; perhaps later we will see fit to break out a more finely-grained analysis if we find any of the cases merit their own class.


    <--c-->

    chapter 16

    the play is the thing

    dale: that's not what p.g. is all about.

    bowerbird: i think it's important to
    give people a good e-book experience.

    dale: that's your opinion.

    bowerbird: yes it is.

    steve: (weakly) i can't...

    dale: no it isn't.

    steve: (weakly) get a...

    bowerbird: is too.

    steve: (weakly) word in edgewise...

    dale: is not.

    lurkers: will you two cut it out?

    bowerbird: is so.

    dale: is not...

    fade to black.[3]

    there are plays in the library. gotta be able to handle plays. dialog, instructions to actors, stage directions, that stuff...
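
    a system might recognize the dialog above with something like the sketch below, which splits a line into speaker, optional instruction in parentheses, and speech. the names are made up, and the pattern assumes the "speaker: (instruction) speech" layout used in this chapter. continuation lines of a long speech are not handled here.

```python
import re
from typing import NamedTuple, Optional


class Line(NamedTuple):
    speaker: str
    direction: Optional[str]  # instruction to the actor, if any
    speech: str


SPEECH = re.compile(
    r"^(?P<speaker>[^:]+):\s*"
    r"(?:\((?P<direction>[^)]*)\)\s*)?"
    r"(?P<speech>.*)$"
)


def parse_speech(line: str) -> Optional[Line]:
    """Parse one line of dialog, or return None for anything else."""
    m = SPEECH.match(line.strip())
    if m is None:
        return None  # e.g. a stage direction such as "fade to black."
    return Line(
        m.group("speaker").strip(),
        m.group("direction"),
        m.group("speech"),
    )
```

    lines that return None can then be treated as stage directions or ordinary prose.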


    <--c-->

    chapter 17

    epigraphs and epitaphs

     there's an old proverb
    that says just about
    whatever you want it to...
    -- slashdot

    sometimes a chapter starts with a nice pithy quote, which is usually italicized, and often right-justified.

    so you wanna be able to handle that kind of thing.[4]


    <--c-->

    chapter 18

    lists in your book

    i like lists. here's a list:

    1. one
    2. two
    3. three
    4. four
    5. five
    6. six
    7. seven
    8. i forget what 8 was for.
    9. number 9, number 9...

    gotta be able to handle lists...


    <--c-->

    a subsection of chapter 18

    sometimes you want to indent the items. here's an example of an indented list:

        1. one
        2. two
        3. three
        4. four
        5. five
        6. six
        7. seven
        8. i still forget what 8 was for.
        9. number 9, number 9...

    still gotta be able to handle lists...


    <--c-->

    another subsection of chapter 18

    here's another example of a list:

     * mercury
    * venus
    * earth
    * mars
    * jupiter
    * saturn
    * uranus
    * neptune
    * pluto

    like poems, items in a list generally want to stick together on the same page, if possible.

    still gotta be able to handle lists...
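
    detecting the two kinds of list items shown in this chapter could be sketched like so. the function name and the labels it returns are assumptions for illustration only.

```python
import re

NUMBERED = re.compile(r"^\s*(\d+)\.\s+(.*)$")  # e.g. "3. three"
BULLETED = re.compile(r"^\s*\*\s+(.*)$")       # e.g. "* earth"


def classify_list_line(line: str) -> tuple[str, str]:
    """Label a line as 'numbered', 'bulleted', or plain 'text'."""
    m = NUMBERED.match(line)
    if m:
        return ("numbered", m.group(2))
    m = BULLETED.match(line)
    if m:
        return ("bulleted", m.group(1))
    return ("text", line.strip())
```

    leading spaces are ignored, so indented lists are recognized the same way. a viewer that cares about the indentation would need to measure it before calling this.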


    <--c-->

    chapter 19

    the meta-data chapter

    a lot of people think "meta-data" is important. i think they're full of poop, but why not make 'em happy?

    so give them their own section -- call it the "meta-data section" -- and then let them put whatever makes 'em happy into that section.

    you will find the meta-data section toward the very end of this document, where it belongs, after the "real" data.


    <--c-->

    chapter 20

    zen markup language

    this test-suite document is a demonstration of z.m.l. -- "zen markup language" -- a system by which a set of simple formatting rules can take the place of complicated markup languages.

    this document is "marked up" in z.m.l. and will spring to life when displayed by a z.m.l.-viewer.

    furthermore, a z.m.l.-viewer can perform all of the tasks necessary to implement the features that this test-suite represents: the hot-linking, the styling, different layouts, tables, pictures, formatting for plays, the lists, the whole thing, without the difficulty of heavy markup languages.


    <--c-->

    chapter 21

    the end

    we hope you've enjoyed this test-suite document. if you have any questions, feel free to ask them.

    this is a draft, so please suggest improvements. and if you want to make your own test-suite, do!

    oh yeah, since he attained the 10,000 e-text mark, michael has a new goal now -- one million e-texts!

    maybe you can help him with his new goal? :+)

     http://www.gutenberg.org
    

    have a nice day.

    the end.


    <--c-->

    the notes section

    [1] personally, i don't think we need to make a distinction between footnotes and endnotes any more. i believe that all the types of notes should be stored at the end of the file, like these notes, but i think the person should be able to display them at the point of reference in the actual body of the text. therefore, they are actually a sort of hybrid between footnotes and endnotes, combining the strengths and convenience of both types.

    [2] in later years, it was made clear that lincoln was referring to all "people", and not just men, that women are equally equal.

    [3] this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on.

    > look, it even has a second paragraph! this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on.

    > oh no! a third paragraph. way too long! this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on. this is a test footnote. because of that, it's going to go on and on and on.

    [4] this is another test footnote. but it will be short.


    <--c-->

    meta-data

     * title = the test-suite for project gutenberg
    * author = bowerbird intelligentleman
    * for = project gutenberg
    * website = http://www.gutenberg.org
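
    reading the meta-data lines above into a lookup table might go something like this sketch. it assumes every entry has the "* key = value" shape shown here, and "parse_metadata" is a made-up name.

```python
def parse_metadata(lines: list[str]) -> dict[str, str]:
    """Collect '* key = value' lines into a dictionary."""
    meta: dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("*") or "=" not in line:
            continue  # not a meta-data entry
        # split on the first "=" only, so values may contain "="
        key, _, value = line[1:].partition("=")
        meta[key.strip()] = value.strip()
    return meta
```

    splitting on the first "=" only means a value such as a web address with a query string survives intact.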


    <--c-->














































































    
    
    2025/06/05 -- 22:22:09/thu