[whatwg] PaceEntryMediatype

Mon Dec 4 10:54:04 PST 2006

On Mon, 4 Dec 2006, Thomas Broyer wrote:
> 
> There's no need to fetch every link if you base your assumptions on the 
> type="" attribute (and *only* the type="" attribute, not the combination 
> with any special rel="" attribute value). If you don't use the type="" 
> attribute on <link>s, you'll have many more requests than if you did, 
> because of the "fetch to discover the content-type" algorithm described 
> for <link> elements, but that's the author's problem, and it's not limited 
> to feed autodiscovery, so…

So your proposal is to ignore the rel="" attribute altogether for feed 
autodiscovery? That seems contrary to what you were saying before, namely 
that there should be a way to link to Atom documents that aren't feeds 
without having them autodetected.

> > > No more than they already are. rel="alternate" is for linking to 
> > > alternate representations, and hundreds of millions of syndication 
> > > feed links are not using it that way; they already are 
> > > non-conforming.
> >
> > Fair enough. They still exist, though. Browser vendors aren't going to 
> > stop supporting this. We would be just sticking our heads in the sand 
> > if we ignored this.
> 
> Many things are marked as "deprecated" in earlier HTML versions, and
> are still supported by browsers.
> Also, as the misuse of rel="alternate" is not machine testable, and
> given that I don't propose "banning" the use of rel="alternate" for
> feed autodiscovery, I can't see how a browser vendor could "stop
> supporting this".

If you don't want browsers to implement the spec, why do you care what the 
spec says? I'm confused. If the specification is ignored by browser 
vendors, as you seem to be advocating, then the specification is useless.

> > When people link to an Atom document, they are giving a syndication 
> > feed. I'm sure theoretically there could be other uses of Atom, but 
> > from my studies of Web content, I haven't seen any evidence that this 
> > is widespread enough to deserve special treatment.
> 
> Seems like you really didn't understand my point…
> « rel="alternate" + type="RSS or Atom" means rel="feed alternate" »
> *is* special treatment.

Which is justified, since feeds _are_ widespread. I was saying that 
non-feed Atom documents are not widespread and that _they_ don't deserve 
special treatment, not the other way around.

> « type="RSS or Atom" means it's a syndication feed, whichever the
> rel="" value » is *not* special treatment.

It would be the only way to create a hyperlink with a <link> element 
without also specifying a rel="" attribute; that seems like special 
treatment to me. It also seems like a *superset* of what the spec 
currently says. If you don't like what it says now, why would you like it 
to be even more general?

> « rel="feed" means it's a feed –with *my* definition of the "feed" 
> relationship–, whichever the type="" value » is *not* special 
> treatment.

The type="" attribute is not relevant when rel=feed is set, according to 
the current spec. Your definition was a subset of the current definition, 
and didn't cover some of the existing use cases. I don't see how it 
changes the processing of the type="" attribute at all.

> > > In 4.4.3.1 (Link type "alternate"), remove this paragraph:
> > > """If the alternate keyword is used with the type attribute set to the
> > > value application/rss+xml or the value application/atom+xml, then the
> > > user agent must treat the link as it would if it had the feed keyword
> > > specified as well."""
> >
> > Removing this paragraph breaks existing practices.
> 
> No, it doesn't.
> <link rel="alternate" type="application/rss+xml" href="A"> links to a
> syndication feed, not because of the rel="alternate" or its
> combination with the type="application/rss+xml", but just because of
> the type="application/rss+xml".

No, browsers need both to consider it a link to a feed.
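
As a rough sketch of the rule in the quoted spec paragraph (written in 
Python; the function name effective_rel is made up for illustration, and 
ignoring any parameters after the media type is an assumption, not 
something the spec text says):

```python
# Media types named in the quoted spec paragraph.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

def effective_rel(rel, type_=None):
    """Return the link-type keywords after applying the quoted rule."""
    keywords = {token.lower() for token in (rel or "").split()}
    # Assumption: parameters such as ";charset=..." are ignored.
    media_type = (type_ or "").split(";", 1)[0].strip().lower()
    # Both conditions are needed: the alternate keyword AND a feed type.
    if "alternate" in keywords and media_type in FEED_TYPES:
        keywords.add("feed")
    return keywords
```

With this reading, rel="alternate" plus an RSS type gains the feed 
keyword, while the same type on a link with some other rel="" value 
does not.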

> We have a problem with application/atom+xml because it can represent
> either a feed or a standalone entry, but the Atom WG is working on
> this issue (either we'll have a new 'type' parameter:
> application/atom+xml;type=entry, or a new media type:
> application/atom.entry+xml), so Atom won't be any different from RSS.
> And given that I redefine rel="feed" and feed autodiscovery (see
> below), the above quoted paragraph is no longer appropriate.

I don't see how.

> > > Remove rel="feed" or, if you really think it's different from
> > > rel="index", define it that way:
> > > """The feed keyword indicates that the referenced document is a
> > > syndication feed which is or has been linking to the current page as a
> > > feed item.
> > > For example, in a Web log, a page representing a single entry can link
> > > to the Web log homepage and/or the Web log's Atom or RSS feed using
> > > the link type feed."""
> >
> > There are syndication feeds that don't fit this definition.
> 
> Yes, of course, and they will be discovered based on the content-type,
> and rel="" will regain its real role: describing the relationship
> between the two resources (and not describing the other end of the
> link).

We don't want to rely on the content type, because that isn't scalable or 
extensible.

> Definition of feed: a bag of items; the representation of a feed
> generally exposes only the 10, or so, latest created or updated items.
> You'll note that this has nothing to do with the feed "format" (Atom,
> RSS, a Web log's homepage in HTML, etc.)

Exactly.

> If a document was once linked from a feed's representation as an item,
> it is an item of this feed, even if the feed's current representation
> doesn't link to it anymore. The relationship still exists. This
> relationship is "I am an item of this feed" or "this is a feed within
> which I once appeared". I propose representing it as rel="feed".

But the page might never have been in the feed, as previously discussed.

> > For example, a home page could link to various feeds for things like 
> > planned outages, news, press releases, etc, not all of which might be 
> > on the page itself.
> 
> What do you mean by "might be on the page itself"?

I mean that the feed might contain items that were never part of the page 
linking to the feed. For example, this page:

   <!DOCTYPE HTML>
   <title>Feeds for this site</title>
   <link rel=feed href=status.xml>
   <link rel=feed href=news.xml>
   <link rel=feed href=links.xml>
   <p>This page links to the three feeds for this site.

There are no items on that page, but it links to three feeds that the site 
provides.
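
For illustration, here is a minimal sketch of how a consumer might pull 
those three links out of the page above using Python's standard 
html.parser module. The names FeedLinkCollector and find_feeds are 
hypothetical, and the sketch only looks at the rel="" keyword; it does 
not fetch anything or check content types.

```python
from html.parser import HTMLParser

class FeedLinkCollector(HTMLParser):
    """Collects the href of every <link> whose rel includes "feed"."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel="" is a space-separated list of keywords, compared
        # case-insensitively here.
        rel_keywords = (attributes.get("rel") or "").lower().split()
        if "feed" in rel_keywords and attributes.get("href"):
            self.feeds.append(attributes["href"])

PAGE = """<!DOCTYPE HTML>
<title>Feeds for this site</title>
<link rel=feed href=status.xml>
<link rel=feed href=news.xml>
<link rel=feed href=links.xml>
<p>This page links to the three feeds for this site.
"""

def find_feeds(html):
    collector = FeedLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.feeds
```

Calling find_feeds(PAGE) yields the three hrefs in document order, even 
though the page itself contains no items.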

> Anyway, if you link to something, there's a reason. This reason is that 
> there is a relationship between the current document and the thing the 
> link points to. This relationship is described in the rel="" attribute.

Not really. There are many link types these days that don't really 
describe a relationship between documents. nofollow, for example. The 
"real world" has made it clear that sometimes, the "link types" aren't 
actually relationships.

> Is what you want an algorithm for feed autodiscovery?
>
> for each <link> in the document:
>     if @rel="feed":
>         if canSubscribeTo(@type)*:
>             add to list of "links to feeds"
>     if isSubscribable(@type)*:
>         add to list of "links to feeds"
> 
> * canSubscribeTo: <link rel="feed"> could point to an HTML page.
> There's no reason some aggregators couldn't subscribe to it if it uses
> HTML5's <article> or the hAtom microformat, for example. If this is
> the case, canSubscribeTo would return true for "text/html". @type is
> the type="" attribute if present, or the Content-Type returned by a
> fetch.
> * isSubscribable: returns true if the type is any recognized
> "subscribable content type": RSS or Atom, but also
> text/vnd.IPTC.NewsML for example. @type is the type="" attribute if
> present, or the Content-Type returned by a fetch.

The above is pretty much what the spec says, except that if it _doesn't_ 
have rel=feed, then it isn't added to the list of feeds, because the 
author didn't want it added to the list of feeds for some reason. For 
example, in a browser that thinks text/html is subscribable as you 
suggest, any <link> to an HTML page would count as subscribable, even 
things like copyright licenses. This is clearly not what the author wants, 
and would be bad for usability.

It really isn't clear to me what problem you see with the current spec 
that you are trying to solve. I thought I knew, but your proposals suffer 
from the same problems, so I'm guessing I was wrong.
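
To make the difference concrete, here is a small Python sketch that 
compares the quoted pseudocode with the behaviour described above. It is 
a simplification under stated assumptions: links are plain dictionaries 
that always carry a type (no fetching), canSubscribeTo and 
isSubscribable are collapsed into one set of types, the alternate 
shorthand is left out, and all function names and file names are 
invented for the example.

```python
def rel_keywords(link):
    return set(link.get("rel", "").lower().split())

def proposed_feeds(links, subscribable):
    """The quoted pseudocode: the second branch ignores rel entirely."""
    found = []
    for link in links:
        if "feed" in rel_keywords(link) and link["type"] in subscribable:
            found.append(link["href"])
        elif link["type"] in subscribable:
            found.append(link["href"])
    return found

def spec_like_feeds(links, subscribable):
    """As described above: without the feed keyword, nothing is added."""
    return [
        link["href"]
        for link in links
        if "feed" in rel_keywords(link) and link["type"] in subscribable
    ]

# Hypothetical browser that treats text/html as subscribable.
SUBSCRIBABLE = {"application/atom+xml", "text/html"}

LINKS = [
    {"rel": "feed", "type": "application/atom+xml", "href": "news.xml"},
    {"rel": "license", "type": "text/html", "href": "license.html"},
    {"rel": "feed", "type": "text/html", "href": "blog.html"},
]
```

Under these assumptions, proposed_feeds(LINKS, SUBSCRIBABLE) includes 
license.html, while spec_like_feeds(LINKS, SUBSCRIBABLE) does not.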

On Mon, 4 Dec 2006, Michel Fortin wrote:
> 
> I'd like to suggest a possible solution that would address these two issues at
> the same time. The type attribute allows for parameters after the mime type.
> So what about this:
> 
>     <link rel="alternate" type="application/atom+xml;role=feed" href="...">
>     <link rel="alternate" type="application/atom+xml;role=entry" href="...">
>     <link rel="alternate" type="application/atom+xml;role=edit" href="...">
> 
> If the type parameter "role" is not present, "role=feed" would be implied.

Thomas mentions that this is indeed being considered. It would indeed 
solve the problem.
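
A consumer could read such a parameter roughly as follows. This is only 
a sketch in Python: the "role" parameter is the one proposed in the 
quoted message, not something that is defined anywhere yet, and the 
function name link_role is made up for the example.

```python
def link_role(type_value, default="feed"):
    """Split a type="" value into (media type, role).

    If no role parameter is present, the default ("feed") is implied,
    as suggested in the quoted proposal.
    """
    media_type, *params = [part.strip() for part in type_value.split(";")]
    for param in params:
        name, _, value = param.partition("=")
        if name.strip().lower() == "role":
            return media_type.lower(), value.strip().strip('"').lower()
    return media_type.lower(), default
```

For example, link_role("application/atom+xml;role=entry") gives 
("application/atom+xml", "entry"), and a bare "application/atom+xml" 
falls back to the "feed" role.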

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'