Problem Statement: URL
draft-ruby-url-problem-00
This document is an Internet-Draft (I-D).
Anyone may submit an I-D to the IETF.
This I-D is not endorsed by the IETF and has no formal standing in the
IETF standards process.
The information below is for an old version of the document.
| Field | Value |
|---|---|
| Document type | This is an older version of an Internet-Draft whose latest revision state is "Expired". |
| Authors | Sam Ruby, Larry M Masinter |
| Last updated | 2014-12-17 |
| RFC stream | (None) |
| Stream state | (No stream defined) |
| Consensus boilerplate | Unknown |
| RFC Editor Note | (None) |
| IESG state | I-D Exists |
| Telechat date | (None) |
| Responsible AD | (None) |
| Send notices to | (None) |
Internet Engineering Task Force S. Ruby, Ed.
Internet-Draft IBM
Intended status: Informational L. Masinter
Expires: June 20, 2015 Adobe
December 17, 2014
Problem Statement: URL
draft-ruby-url-problem-00
Abstract
This document lays out the problem space of possibly conflicting
standards between multiple organizations for URLs and things like
them, and proposes some actions to resolve the conflicts. From a
user or developer point of view, it makes no sense for there to be a
proliferation of definitions of URL nor for there to be a
proliferation of incompatible implementations. This shouldn't be a
competitive feature. Therefore there is a need for the organizations
involved to update and reconcile the various Internet Drafts,
Recommendations, and Standards in this area.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 20, 2015.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Ruby & Masinter Expires June 20, 2015 [Page 1]
Internet-Draft Problem Statement: URL December 2014
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Brief History of URL standards
2. Current Organizations and Specs in Development
2.1. IETF
2.2. WHATWG
2.3. W3C
2.4. WebPlatform
2.5. Unicode Consortium
3. Problem Statements
4. Outline of Potential Solution
5. Acknowledgements
6. IANA Considerations
7. Security Considerations
8. Informative References
Authors' Addresses
1. Brief History of URL standards
This section contains a very compressed history of URL standards, in
sufficient detail to set some context.
The first standards-track specification for URLs was [RFC1738] in
1994. (That spec contains more background material.) It defined
URLs as ASCII only. Although it was quickly determined that it was
desirable to allow non-ASCII characters, shoehorning UTF-8 into
ASCII-only systems was unacceptable; at the time Unicode was not so
widely deployed. The tack was taken to leave "URI" alone and define
a new protocol element, "IRI"; [RFC3987] was published in 2005 (in
sync with the [RFC3986] update to the URI definition).
The IRI-to-URI transformation specified in [RFC3987] had options; it
wasn't a deterministic path. The URI-to-IRI transformation was also
heuristic, since there was no guarantee that %xx-encoded bytes in the
URI were actually meant to be %xx percent-hex-encoded bytes of a UTF-8
encoding of a Unicode string.
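The heuristic nature of the URI-to-IRI direction can be illustrated
with a minimal Python sketch. It uses only the standard library's
urllib.parse module; the helper name uri_to_iri_guess is hypothetical
and is not taken from [RFC3987] or any other specification. It also
ignores the requirement that reserved characters stay percent-encoded,
so it is an illustration rather than a conforming converter.

```python
from urllib.parse import quote, unquote_to_bytes


def uri_to_iri_guess(component: str) -> str:
    """Hypothetical helper: decode %xx escapes only if they form valid UTF-8.

    Simplification: a real converter must also keep reserved characters
    such as "/" or "?" percent-encoded; this sketch does not.
    """
    raw = unquote_to_bytes(component)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not UTF-8, so leave the escapes untouched.
        return component


# The same character yields different escapes under different encodings.
utf8_form = quote("é", encoding="utf-8")       # UTF-8 bytes C3 A9
latin1_form = quote("é", encoding="latin-1")   # Latin-1 byte E9

print(utf8_form, uri_to_iri_guess(utf8_form))
print(latin1_form, uri_to_iri_guess(latin1_form))

# Valid UTF-8 is still only a guess: the same two bytes are also a
# plausible Latin-1 string, and the URI alone cannot say which was meant.
print(unquote_to_bytes(utf8_form).decode("latin-1"))
```

The last line shows why the decision is a guess: escapes that happen to
decode as UTF-8 may have been produced from a different character
encoding, and nothing in the URI records which one was used.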
To address issues and to fix URL for HTML5, a new IRI working group
<https://tools.ietf.org/wg/iri/charters [1]> was established in IETF
in 2009. Despite years of development, the IRI group was closed in
2014, with the consolation that the documents that were being
developed in the IRI working group could be updated as individual
submissions or within the "applications area" working group. In
particular, one of the IRI working group items was to update
[appsawg-uri-scheme-reg], which is currently under development in
IETF's application area.
Independently, the HTML specifications in the WHATWG and W3C
redefined "URL" in an attempt to match what some of the browsers were
doing. This definition was moved out into the "URL - Living
Standard" [URL-LS].
The world has also moved on. ICANN has approved non-ASCII top level
domains, but IDNA specs ([RFC3490] and [RFC5895]) did not fully
address IRI processing. Subsequently, the Unicode Consortium
produced [UTS-46].
2. Current Organizations and Specs in Development
There are multiple umbrella organizations which have produced
multiple documents, and it's unclear whether there's a trajectory to
make them consistent. This section tries to enumerate currently
active organizations and specs.
Organizations include the IETF [2], the WHATWG [3], the W3C [4],
WebPlatform.org [5], and the Unicode Consortium [6]. Relevant specs
under development in each organization include:
2.1. IETF
[appsawg-uri-scheme-reg] and [kerwin-file-scheme] are under active
development.
The IRI working group closed, but work can continue in the
Applications Area working group. Three drafts ([iri-3987bis],
[iri-comparison], and [iri-bidi-guidelines]), originally intended to
obsolete [RFC3987], need updating and are currently abandoned.
In addition, there's quite a bit of activity around URNs and library
identifiers in the URN working group, including some expressions of
desire to update RFC 3986 to better accommodate desired URN semantics.
2.2. WHATWG
The [URL-LS] is being developed as a living standard [7]. It
primarily focuses on specifying what is important for browsers. The
means by which new schemes might be registered is not yet defined.
This work is based on [UTS-46], and is intended to obsolete both
[RFC3986] and [RFC3987].
2.3. W3C
The Web Applications Working Group [8], in conjunction with the W3C
TAG [9], has sporadically republished the WHATWG work, with no
technical content differences, as [W3C-URL]. There is a
[url-workmode] proposal to formalize this relationship.
2.4. WebPlatform
[WP-URL] is being developed on a develop [10] GitHub branch based on
[URL-LS]. It currently contains work that has yet to be folded back
into the [URL-LS], primarily to rewrite the parser logic in a way
that is more understandable and approachable. The intent is to merge
this work once it is ready, and to actively work to keep the two
versions in sync.
2.5. Unicode Consortium
[UTS-46] defines parameterized functions for mapping domain names.
[URL-LS] builds upon this work, specifying particular values to be
used for these parameters.
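As a rough illustration of what fixing such parameters means in
practice, the sketch below uses the "idna" codec in Python's standard
library. That codec implements the older IDNA2003 ToASCII operation
from [RFC3490], not [UTS-46], and it is assumed here to behave as if
UseSTD3ASCIIRules were false, so it only approximates the processing
described above. The wrapper name to_ascii is hypothetical.

```python
def to_ascii(domain: str) -> str:
    """Hypothetical wrapper around Python's built-in IDNA2003 codec."""
    return domain.encode("idna").decode("ascii")


# Non-ASCII labels are mapped (case-folded, normalized) and then
# Punycode-encoded with the "xn--" prefix.
print(to_ascii("bücher.example"))
print(to_ascii("BÜCHER.example"))

# With UseSTD3ASCIIRules effectively false, "_" in a label is accepted.
# A profile that sets the flag to true would reject this name instead.
print(to_ascii("a_b.example"))
```

Two implementations that choose different values for such a flag will
disagree on whether the last name is a valid host, which is why a URL
specification has to state the values it relies on.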
3. Problem Statements
The main problem is conflicting specifications that overlap but don't
match each other.
Additionally, the following issues need to be resolved to
make URL processing unambiguous and stable.
o Nomenclature: over the years, a number of different sets of
terminology have been used. URL / URI / IRI is not the only
difference. [tantek-slice] chronicles a number of differences.
o Parameterization: standards in this area need to define such
matters as normalization forms and values for parameters such as
UseSTD3ASCIIRules.
o Interoperability: even after accounting for the above, there is a
demonstrable lack of interoperability across popular libraries and
browsers. [whatwg-interop] identifies a number of such
differences.
o Specific scheme definitions: some UR* scheme definitions are
woefully out of date, incomplete, or don't correspond to current
practice, and the process for updating their definitions is unclear.
This includes "file:", for which there is a current effort, but there
are others which need review (including "ftp:" and "data:").
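One difference of the kind described in the Interoperability item can
be observed with Python's urllib.parse module, which follows the
generic syntax of [RFC3986] fairly closely. The sketch below is only
an illustration under that assumption. The comment about [URL-LS]
reflects a reading of that specification's handling of backslashes in
"http" URLs and is not checked against any particular browser here.

```python
from urllib.parse import urlsplit

# A URL that uses a backslash where a slash would normally appear.
parts = urlsplit("http://example.com\\path")

# The generic-syntax parser ends the authority only at "/", "?" or "#",
# so the backslash and "path" stay inside the authority component.
print(repr(parts.netloc))
print(repr(parts.path))

# Assumption: a parser following [URL-LS] treats "\" like "/" for "http"
# URLs, giving host "example.com" and path "/path" for the same input.
```

The same input therefore names different hosts depending on which
specification a library follows, which is one source of the security
exposure noted under Security Considerations.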
4. Outline of Potential Solution
This problem clearly requires a cross-organizational solution,
specifically:
o Build a plan to update or obsolete [RFC3986], [RFC3987],
[RFC5895], and [kerwin-file-scheme] to be consistent with [URL-LS]
and [UTS-46]. This may involve working to get the other
specifications updated, if only to clarify nomenclature.
o Change the [URL-LS] goals to only obsolete specifications listed
above that are not updated. Presuming that [RFC3986] is updated,
explicitly state that canonical URLs (i.e., the output of the URL
parser) not only round trip, but also are valid URIs.
o Reconcile how [appsawg-uri-scheme-reg] and [URL-LS] handle
currently unknown schemes, update [appsawg-uri-scheme-reg] to
state that registration applies to both URIs and URLs, and update
[URL-LS] to indicate that [appsawg-uri-scheme-reg] is how you
register schemes.
o Have the W3C adopt [url-workmode].
o Other than responding to any feedback that may be provided, no
changes to any Unicode Consortium product are required.
5. Acknowledgements
Helpful comments and improvements to this document have come from
Anne van Kesteren and Graham Klyne.
6. IANA Considerations
This memo currently includes no request to IANA, although an updated
[appsawg-uri-scheme-reg] might add some additional requirements and
information to the IANA URI scheme registry [11] to make clear that the
schemes serve as URL schemes and IRI schemes as well as URI schemes.
7. Security Considerations
In addition to the security exposures created when URLs work
differently in different systems, all of the security considerations
defined in [RFC3490], [RFC3986], [RFC3987], and [RFC5895] apply to
URLs.
8. Informative References
[RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
Resource Locators (URL)", RFC 1738, December 1994.
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
"Internationalizing Domain Names in Applications (IDNA)",
RFC 3490, March 2003.
[RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
Text on Security Considerations", BCP 72, RFC 3552, July
2003.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66, RFC
3986, January 2005.
[RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource
Identifiers (IRIs)", RFC 3987, January 2005.
[RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for
Internationalized Domain Names in Applications (IDNA)
2008", RFC 5895, September 2010.
[URL-LS] van Kesteren, A. and S. Ruby, "URL Living Standard", 2014,
<https://url.spec.whatwg.org/>.
[UTS-46] Davis, M. and M. Suignard, "Unicode IDNA Compatibility
Processing", 2014, <http://unicode.org/reports/tr46/>.
[W3C-URL] van Kesteren, A. and S. Ruby, "URL Working Draft", 2014,
<http://www.w3.org/TR/url/>.
[WP-URL] van Kesteren, A. and S. Ruby, "URL Standard", 2014,
<https://specs.webplatform.org/url/webspecs/develop/>.
[appsawg-uri-scheme-reg]
Thaler, D., Hansen, T., Hardie, T., and L. Masinter,
"Guidelines and Registration Procedures for New URI
Schemes", 2014,
<https://tools.ietf.org/html/draft-ietf-appsawg-uri-scheme-reg>.
[iri-3987bis]
Duerst, M., Suignard, M., and L. Masinter,
"Internationalized Resource Identifiers (IRIs)", 2012,
<https://tools.ietf.org/html/draft-ietf-iri-3987bis-13>.
[iri-bidi-guidelines]
Duerst, M., Masinter, L., and A. Allawi, "Guidelines for
Internationalized Resource Identifiers with Bi-directional
Characters (Bidi IRIs)", 2012,
<https://tools.ietf.org/html/draft-ietf-iri-bidi-guidelines>.
[iri-comparison]
Masinter, L. and M. Duerst, "Comparison, Equivalence and
Canonicalization of Internationalized Resource
Identifiers", 2012,
<https://tools.ietf.org/html/draft-ietf-iri-comparison>.
[kerwin-file-scheme]
Kerwin, M., "The file URI Scheme", 2014,
<https://tools.ietf.org/html/draft-kerwin-file-scheme>.
[tantek-slice]
Celik, T., "How many ways can you slice a URL and name the
pieces?", 2011,
<http://tantek.com/2011/238/b1/many-ways-slice-url-name-pieces>.
[url-workmode]
Ruby, S., "URL WorkMode", 2014,
<https://github.com/webspecs/url/blob/develop/docs/workmode.md#preface>.
[whatwg-interop]
Ruby, S., "URL test results", 2014,
<https://url.spec.whatwg.org/interop/test-results/>.
Authors' Addresses
Sam Ruby (editor)
IBM
Raleigh
USA
Email: rubys@intertwingly.net
URI: http://intertwingly.net/
Larry Masinter
Adobe
345 Park Ave
San Jose, CA 95110
USA
Email: masinter@adobe.com
URI: http://larry.masinter.net/