[whatwg] Interpretation issue: can <section> be used for "extended paragraphs"?
Jukka K. Korpela
jkorpela at cs.tut.fi
Wed Jun 15 00:09:12 PDT 2011
2011-06-14 10:32, Ian Hickson wrote:
> On Thu, 10 Mar 2011, Jukka K. Korpela wrote:
>> A sentence in the text may continue with list items, displayed e.g. as a
>> bulleted list. So the list breaks the paragraph as a block of text but
>> not logically - the list items are part of the sentence just as they
>> would be if they were just mentioned in the text, for example using 1)
>> numbers in the text, 2) letters in the text, or 3) no special notation.
> Indeed, but "block of text" is pretty much what a paragraph is -- it isn't
> a logical construct.
The word "paragraph" is ambiguous, as the current wording says indirectly
but clearly: It defines "The p element represents a paragraph", but the
word "paragraph" links to the following:
"The term paragraph as defined in this section is distinct from (though
related to) the p element defined later. The paragraph concept defined
here is used to describe how to interpret documents.
A paragraph is typically a run of phrasing content that forms a block of
text with one or more sentences that discuss a particular topic, as in
typography, but can also be used for more general thematic grouping. For
instance, an address is also a paragraph, as is a part of a form, a
byline, or a stanza in a poem."
So it says that p is a paragraph, linking to an explanation that says
paragraph is different from p. The explanation mentions "the term
paragraph as defined in this section", but it does not give a definition
- a sentence that begins with "A paragraph is typically" is a prelude to
a definition, not a definition.
I gather that "The p element represents a paragraph" more or less means
"the p element denotes a block of text". Can you make this more
explicit, please? This is very confusing even to people who regularly
read specifications for breakfast. In the current wording, there are
_two_ dimensions of vagueness in "paragraph": whether it is the
classical concept of text that discusses one topic or the layout concept
roughly corresponding to the old HTML block concept; and whether it is
about explicitly marked-up elements (<p>...</p>) or more generally about
constructs whose "paragraphness" might be inferred by some rules.
It would probably be best to dispense with the word "paragraph", as many
people can't avoid thinking that paragraph is something logical, not the
layout concept of a block of text. But unfortunately, in HTML heritage,
the p element is not _purely_ a block of text. In addition to the name
and old descriptions, it associates with the logical concept of
paragraph, since p elements have default top and bottom margins. So they
differ from div elements. A div element containing only text can be
characterized as a block of text, and so can a p element, but there's a
difference.
Maybe something like the following might express this:
The p element represents a block of text so that consecutive p elements
are regarded as topically distinct. The name p comes from "paragraph",
and the p element typically corresponds to a paragraph in prose, i.e. a
subdivision of text that deals with one point or gives the words of one
speaker in a discussion. However, the p element is also used for other
thematic grouping, for example for a byline, a mailing address, a
label and associated field in a form, or a stanza in a poem.
Visual browsers are expected to render p elements by default with empty
lines before and after, caused by default top and bottom margin.
>> a) for styling purposes (you need a container element so that you can
>> specify, without clumsily using classes on both the P and the UL, e.g.
>> that vertical spacing be reduced or zero)
> <div> handles this case:
> <div>This sentence
> <li>a list
> ...and is made of four paragraphs but can be styled as one since the
> <div> element is used instead of <p>.</div>
But if this follows, for example, a table, then extra measures would be
needed to create vertical spacing. Using the p element would make the
spacing the default. Similarly for spacing after this construct. So it
would be more robust to use <p>...</p> markup here. Or you would need to
assign style properties to the div element, effectively making it
formatted the same way as p elements normally are, in your document.
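A minimal sketch of that second option, assuming the typical browser default of 1em top and bottom margins on p elements. The class name "para" and the sample text are hypothetical, used only for illustration:

```html
<!DOCTYPE html>
<title>div styled like p (sketch)</title>
<style>
  /* Assumption: p elements get 1em top and bottom margins by default,
     as in typical browser style sheets. */
  div.para { margin: 1em 0; }
  /* Drop the list's own margins so the group reads as one unit. */
  div.para > ul { margin: 0; }
</style>
<table><tr><td>A table that precedes the construct</td></tr></table>
<div class="para">This sentence
 <ul>
  <li>contains</li>
  <li>a list</li>
 </ul>
 ...and continues after it.
</div>
```

Without the first rule, the div would sit directly against the table; with it, the spacing matches what p elements would have given by default.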
I don't think anonymous blocks of text are a good idea. There was a
reason why they were frowned upon in HTML 4.01. After years of favoring
<p>...</p> as a container, as opposed to the original idea that allowed
<p> as an empty element indicating paragraph break, it seems odd to give
so many examples with "loose" text.
So I hope an example like the above but with <p>...</p> markup can be
added, to answer the common question (which is often formulated in terms
of a "list header", but it's really about something that starts as a
paragraph and then moves to listing things down as a bulleted list).
Maybe an explanation like this might be added (perhaps even after the
definition of p, as it really clarifies the concept):
Within a p element, only phrasing content (previously called
"text-level" content) is allowed. This implies that it cannot contain a
list element or a table element for example. A part of a document that
discusses one topic is normally marked up as one p element, but if it
contains lists for example, it needs to be marked up as one or more p
elements intermixed with (not containing) one or more lists. The part
may be marked up as a div element to group the elements for the purposes
of styling and scripting, for example:
<div>
<p>This is text, which may be just a list header (introduction to
the list) or a longer presentation.</p>
<ul>
<li>...</li>
</ul>
<p>Here we may have text that logically continues the discussion
of the topic.</p>
</div>
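To illustrate why the list has to sit beside the p elements rather than inside them, here is a sketch of what happens when an author nests it anyway. Under the HTML parsing rules, the ul start tag implicitly closes the open p element, and the stray </p> end tag then yields an empty p element. The text content is made up for illustration:

```html
<!-- What the author writes (invalid: ul is not phrasing content) -->
<p>The options are:
<ul>
<li>first option</li>
<li>second option</li>
</ul>
</p>

<!-- What the parser effectively builds -->
<p>The options are:
</p>
<ul>
<li>first option</li>
<li>second option</li>
</ul>
<p></p>
```

So a style rule or script that targets the p element never sees the list as its descendant, which is why a wrapping div is needed for grouping.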
* * *
I know this suggestion is long and raw, but I hope its basic content is
something we can agree on. And I have no big problem with using div
markup here, even though it somewhat goes against the spirit of modern HTML.