PROTOCOL PARAMETERS
The implementation of DCP involves various protocol parameters. These include protocol versions,
uniform resource identifiers, character sets, content codings, media types and product tokens. They will be described in this section.
DCP uses a "<major>.<minor>" numbering scheme to indicate
versions of the protocol. The protocol versioning policy is intended to allow the sender
to indicate the format of a message and its capacity for understanding further DCP
communication, rather than the features obtained via that communication. No change is made
to the version number for the addition of message components which do not affect
communication behavior or which only add to extensible field values. The <minor>
number is incremented when the changes made to the protocol add features which do not
change the general message parsing algorithm, but which may add to the message semantics
and imply additional capabilities of the sender. The <major> number is incremented
when the format of a message within the protocol is changed.
The version of a DCP message is indicated by a dcpVersion field in the first line of
the message. If the protocol version is not specified, the recipient must assume that the
message is in the DCP/1.0 format.
Syntax
Note that the major and minor numbers should be treated as separate integers and that
each may be incremented higher than a single digit. Thus, DCP/2.4 is a lower version than
DCP/2.13, which in turn is lower than DCP/12.3. Leading zeros should be ignored by
recipients and never generated by senders.
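The ordering rule above can be sketched in Python. This is a minimal illustration rather than part of the DCP reference: the helper name parse_dcp_version is hypothetical, and the pattern assumes the "DCP/<major>.<minor>" form seen in the examples.

```python
import re

# Assumed textual form, taken from the examples "DCP/2.4", "DCP/2.13", "DCP/12.3".
_VERSION_RE = re.compile(r"DCP/(\d+)\.(\d+)")


def parse_dcp_version(text):
    """Return (major, minor) as two separate integers (hypothetical helper)."""
    match = _VERSION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a DCP version: {text!r}")
    # int() discards leading zeros, so "DCP/01.00" parses as (1, 0).
    return int(match.group(1)), int(match.group(2))


# Tuples compare the major number first and then the minor number,
# each as an integer, which matches the ordering described above.
versions = ["DCP/12.3", "DCP/2.13", "DCP/2.4"]
ordered = sorted(versions, key=parse_dcp_version)
```

Comparing the parsed tuples avoids the mistake of comparing version strings as text or as decimal fractions, either of which would place DCP/2.13 below DCP/2.4.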
This reference defines the 1.0 version of the DCP protocol. DCP applications must include a
dcpVersion of "DCP/1.0".
DCP/1.0 targets must:
o recognize the format of the requestLine for DCP/1.0 requests;
o understand any valid request in the format of DCP/1.0;
o respond appropriately with a message in the same protocol version used by the
initiator if a response is required.
DCP/1.0 initiators must:
o recognize the format of the statusLine for DCP/1.0 responses;
o understand any valid response in the format of DCP/1.0.
Proxy and gateway applications must be careful in forwarding requests that are received
in a format different from that of the application's native DCP version. Since the
protocol version indicates the protocol capability of the sender, a proxy/gateway must
never send a message with a version indicator that is greater than its native version. If
a higher version request is received, the proxy/gateway must either downgrade the request
version or respond with an error. Requests with a version lower than that of the
application's native format may be upgraded before being forwarded; the proxy/gateway's
response to that request must follow the target requirements listed above.
URIs have been known by many names: WWW addresses, Universal Document Identifiers,
Universal Resource Identifiers, and finally the combination of Uniform Resource Locators
(URL) and Names (URN). URIs are simply
formatted strings which identify, via name, location, object instance, property, method,
event or any other characteristic, a network resource. For Internet access
protocols, it is necessary in most cases to define the encoding of the access
algorithm into something concise enough to be termed an address. URIs which refer
to objects accessed via a specific Internet protocol, such as DCP, are known as
Uniform Resource Locators (URLs).
URIs in DCP can be represented in absolute form or relative to some known base URI,
depending upon the context of their use. The two forms are differentiated by the fact that
absolute URIs always begin with a scheme name followed by a colon.
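As a rough sketch of that distinction, the check below treats a URI as absolute when it begins with a scheme name followed by a colon. The function name is hypothetical, and the set of characters permitted in a scheme name is taken from RFC 1738, not from the DCP syntax itself.

```python
import re

# A scheme name followed by a colon marks an absolute URI.
# The character class follows the scheme syntax of RFC 1738 (an assumption here).
_SCHEME_RE = re.compile(r"[A-Za-z0-9+.\-]+:")


def is_absolute_uri(uri):
    """Return True if the URI starts with "<scheme>:" (hypothetical helper)."""
    return _SCHEME_RE.match(uri) is not None
```

A relative reference such as "../status" fails the check because a "/" appears before any colon.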
Syntax
For definitive information on URL syntax and semantics, see RFC 1738 and RFC 1808. The
BNF above includes national characters not allowed in valid URLs as specified by RFC 1738,
since DCP resources are not restricted in the set of unreserved
characters allowed to
represent the relPath part of addresses, and DCP proxies may receive requests for URIs not
defined by RFC 1738.
The "DCP" scheme is used to locate network resources via the DCP protocol.
This section defines the scheme-specific syntax and semantics for DCP URLs.
Syntax
If the port is empty or not given, port 2500 is assumed. The semantics are that the
identified resource is located at the target listening for TCP connections and UDP
messages on that port of that host, and the dcpRequestUri
for the resource is absDcpPath. If
the absDcpPath is not present in the URL, it must be given as "/" when used as a
dcpRequestUri.
Note: Although the DCP protocol is independent of the transport layer protocol, the DCP
URL only identifies resources by their TCP location, and thus non-TCP resources must be
identified by some other URI scheme.
The canonical form for "DCP" URLs is obtained by converting any upAlpha characters in
host to their loAlpha equivalent (hostnames are case-insensitive), eliding the
[ ":" port] if the port is 2500, and replacing an empty absDcpPath with "/".
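The three canonicalization steps can be sketched as follows. This is an illustration under stated assumptions: the URL is assumed to have the shape scheme "://" host [ ":" port ] [ absDcpPath ], the function name is hypothetical, the scheme text is left exactly as written, and bracketed IPv6 hosts are not handled.

```python
DEFAULT_DCP_PORT = 2500  # port assumed when none is given


def canonical_dcp_url(url):
    """Return the canonical form of a DCP URL (hypothetical helper)."""
    scheme, sep, rest = url.partition("://")
    if not sep or scheme.lower() != "dcp":
        raise ValueError(f"not a DCP URL: {url!r}")
    # Everything from the first "/" onward is treated as absDcpPath.
    slash = rest.find("/")
    if slash == -1:
        authority, path = rest, ""
    else:
        authority, path = rest[:slash], rest[slash:]
    host, _, port = authority.partition(":")
    host = host.lower()  # hostnames are case-insensitive
    # Elide an empty port or the default port.
    if port == "" or int(port) == DEFAULT_DCP_PORT:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # An empty absDcpPath becomes "/".
    return f"{scheme}://{netloc}{path or '/'}"
```

Only the host is lower-cased; the path is left untouched because the text above makes no claim that paths are case-insensitive.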
DCP/1.0 applications allow three different formats for the representation of date/time
stamps: RFC 1123 Date, RFC 850 Date and ANSI C Time Date.
Examples
Sun, 06 Nov 1994 08:49:37 GMT
Note: RFC 1123 Format
Sunday, 06-Nov-94 08:49:37 GMT
Note: RFC 850 Format (obsoleted by RFC 1036)
Sun Nov  6 08:49:37 1994
Note: ANSI C (asctime) Format
The first format is preferred as an Internet standard and represents a fixed-length
subset of that defined by RFC 1123. The second format is in
common use, but is based on the obsolete RFC 850 date format and lacks a four-digit
year. DCP/1.0 initiators and targets that parse the date value should accept all three
formats, though they must never generate the third (ANSI C) format.
Note: Recipients of date values are encouraged to be robust in accepting date values
that may have been generated by non-DCP applications.
All DCP/1.0 date/time stamps must be represented in Universal Time (UT), also known as
Greenwich Mean Time (GMT), without exception. This is indicated in the first two formats
by the inclusion of "GMT" as the three-letter abbreviation for time zone, and
should be assumed when reading the ANSI C format.
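A recipient that accepts all three formats but generates only the first might look like the sketch below. The function names are hypothetical, and the code assumes the process runs in a locale with English day and month names, since strptime and strftime are locale-dependent.

```python
from datetime import datetime, timezone

# strptime patterns for the three accepted formats.
RFC_1123 = "%a, %d %b %Y %H:%M:%S GMT"
RFC_850 = "%A, %d-%b-%y %H:%M:%S GMT"
ASCTIME = "%a %b %d %H:%M:%S %Y"


def parse_dcp_date(text):
    """Parse any of the three formats into an aware UTC datetime."""
    for fmt in (RFC_1123, RFC_850, ASCTIME):
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # try the next format
        # All DCP date/time stamps are in GMT, including the asctime form.
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"unrecognized date/time stamp: {text!r}")


def format_dcp_date(moment):
    """Generate the preferred RFC 1123 form from an aware datetime."""
    return moment.astimezone(timezone.utc).strftime(RFC_1123)
```

Trying the preferred format first keeps the common case cheap, and generating only that format keeps senders within the rule that the ANSI C form must never be produced.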
Syntax
Note: DCP requirements for the date/time stamp format apply only to their usage within
the protocol stream. Initiators and targets are not required to use these formats for user
presentation, request logging, etc.
DCP uses the same definition of the term "character set" as that described
for MIME. The term "character set" is used in this reference to refer to a method used
with one or more tables to convert a sequence of octets into a sequence of characters.
Note that unconditional conversion in the other direction is not required, in that not all
characters may be available in a given character set and a character set may provide more
than one sequence of octets to represent a particular character. This definition is
intended to allow various types of character encodings, from simple single-table mappings
such as US-ASCII to complex table switching methods such as those that use ISO 2022's
techniques. However, the definition associated with a MIME character set name must fully
specify the mapping to be performed from octets to characters. In particular, use of
external profiling information to determine the exact mapping is not permitted.
Note: This use of the term "character set" is more commonly referred to as a
"character encoding." However, DCP shares the same terminology
as MIME.
DCP character sets are identified by case-insensitive tokens. The IANA Character Set
registry defines the complete set of tokens. However, because that registry does not
define a single, consistent token for each character set, the DCP preferred
names for those character sets most likely to be used with DCP entities are
defined here. These character
sets include those registered by RFC 1521 (the US-ASCII and ISO-8859 character sets)
and other names specifically recommended for use within MIME charSet parameters.
Syntax
charSet = "US-ASCII" | "ISO-8859-1"
        | "ISO-8859-2" | "ISO-8859-3" | "ISO-8859-4"
        | "ISO-8859-5" | "ISO-8859-6" | "ISO-8859-7"
        | "ISO-8859-8" | "ISO-8859-9" | "ISO-2022-JP"
        | "ISO-2022-JP-2" | "ISO-2022-KR" | "UNICODE-1-1"
        | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8" | token
Although DCP allows an arbitrary token to be used as a charSet
value, any token that
has a predefined value within the IANA Character Set registry must represent the character
set defined by that registry. Applications should limit their use of character sets to
those defined by the IANA registry.
The character set of an entity body should be labeled as the lowest common denominator
of the character codes used within that body, with the exception that no label is
preferred over the labels US-ASCII or ISO-8859-1.
Content coding values are used to indicate an encoding transformation that has been
applied to a resource's data. Content codings are primarily used to allow information to be
compressed or encrypted without losing the identity of its underlying media type.
Syntax
Note: DCP applications should consider "gzip" and
"compress" to be equivalent to "x-gzip" and "x-compress",
respectively.
All content-coding values are case-insensitive. DCP uses contentCoding values in the
Content-Encoding header field. Although the value describes the content-coding, what is
more important is that it indicates what decoding mechanism will be required to remove the
encoding. Note that a single program may be capable of decoding multiple content-coding
formats. Values specified by this reference are defined below.
Definitions
x-gzip - An encoding format produced by the file compression program "gzip" (GNU zip)
developed by Jean-loup Gailly. This format is typically a Lempel-Ziv coding (LZ77) with a
32-bit CRC.
x-compress - The encoding format produced by the file compression program "compress". This
format is an adaptive Lempel-Ziv-Welch coding (LZW).
Note: Use of program names for the identification of encoding formats is not desirable
and should be discouraged for future encodings. Their use here is representative of
historical practice, not good design.
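The sketch below shows one way a recipient might map a Content-Encoding value to a decoder. The function names are hypothetical. Only x-gzip is decoded, using Python's standard gzip module; x-compress is recognized but left unimplemented because the Python standard library has no LZW decoder.

```python
import gzip

# "gzip" and "compress" are treated as equivalent to the x- prefixed names.
_ALIASES = {"gzip": "x-gzip", "compress": "x-compress"}


def normalize_content_coding(value):
    """Lower-case the token (values are case-insensitive) and fold aliases."""
    token = value.strip().lower()
    return _ALIASES.get(token, token)


def decode_body(body, content_coding):
    """Remove the named content coding from a bytes body (hypothetical helper)."""
    coding = normalize_content_coding(content_coding)
    if coding == "x-gzip":
        return gzip.decompress(body)
    # x-compress and any unknown coding fall through to here.
    raise NotImplementedError(f"no decoder for content coding {coding!r}")
```

Normalizing the token before dispatch means the rest of the application only ever sees one spelling of each coding.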
DCP uses Internet Media Types in the Content-Type header field in order to provide open
and extensible data typing.
Syntax
The mainType, subType, and pAttribute terms are case-insensitive.
The pValue term may or may not be case-sensitive, depending on the semantics of the
parameter name. lws must not be generated between the mainType and subType, or between
a pAttribute and its pValue.
Upon receipt of a mediaType with an unrecognized pAttribute, the mediaType should be
treated as if the unrecognized pAttribute and its pValue were not present.
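The handling of case and of unrecognized attributes can be sketched as below. The mediaType syntax is not reproduced in this section, so the code assumes the MIME-style form mainType "/" subType followed by ";"-separated pAttribute "=" pValue pairs. The function name and the known_attributes argument are hypothetical, and quoted values containing ";" are not handled.

```python
def parse_media_type(value, known_attributes=()):
    """Split a mediaType into (mainType, subType, parameters)."""
    type_part, *param_parts = value.split(";")
    main_type, _, sub_type = type_part.strip().partition("/")
    if not main_type or not sub_type:
        raise ValueError(f"not a media type: {value!r}")
    known = {name.lower() for name in known_attributes}
    params = {}
    for part in param_parts:
        attribute, sep, p_value = part.strip().partition("=")
        attribute = attribute.lower()  # pAttribute is case-insensitive
        if not sep or attribute not in known:
            continue  # treat unrecognized attributes as if absent
        # pValue case is preserved; only surrounding quotes are removed.
        params[attribute] = p_value.strip('"')
    return main_type.lower(), sub_type.lower(), params
```

For example, parsing 'Text/HTML; Charset=ISO-8859-4; x-foo=1' with only "charset" recognized yields the lower-cased type and subtype and a single charset parameter.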
Note: Media type values are registered with the Internet Assigned Numbers Authority (IANA).
The media type registration process is outlined in RFC 1590. Use of non-registered media
types is discouraged.
CANONICALIZATION
Internet media types are registered with a canonical form. In general, an entity
body
transferred via DCP must be represented in the appropriate canonical form prior to its
transmission. If the body has been encoded with a content
encoding, the underlying data
should be in canonical form prior to being encoded.
Media subTypes of the "text" type use crlf as the text line break when in
canonical form. However, DCP allows the transport of text media with plain cr
or lf alone representing a line break when used consistently within the entity
body. DCP applications must accept crlf, bare cr, and bare lf as being
representative of a line break in text media received via DCP.
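A receiver that tolerates all three line-break forms might be sketched as follows. The function names are hypothetical, and the code assumes a character set in which octets 13 and 10 represent cr and lf.

```python
import re

# Match crlf first so that it is not counted as two separate breaks.
_LINE_BREAK_RE = re.compile(rb"\r\n|\r|\n")


def split_text_lines(body):
    """Split a text entity body on crlf, bare cr, or bare lf."""
    return _LINE_BREAK_RE.split(body)


def to_canonical_text(body):
    """Rewrite every line break as crlf, the canonical form."""
    return _LINE_BREAK_RE.sub(b"\r\n", body)
```

The order of the alternatives matters: listing the two-octet sequence before the single octets keeps a crlf pair from producing an empty line between its two halves.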
In addition, if the text media is represented in a character set that does not use
octets 13 and 10 for cr and lf respectively, as is the case for some multi-byte character
sets, DCP allows the use of whatever octet sequences are defined by that character set to
represent the equivalent of cr and lf for line breaks. This flexibility regarding line
breaks applies only to text media in the entity body; a bare cr or lf should not be
substituted for crlf within any of the DCP control structures (such as header fields and
multipart boundaries).
The "charset" parameter is used with some media types to define the character
set of the data. When the sender provides no explicit charset parameter,
media subTypes of the "text" type are defined to have a default charset value of
"ISO-8859-1" when received via DCP. Data in character sets other than
"ISO-8859-1" or its subsets must be labeled with an appropriate charset value in
order to be consistently interpreted by the recipient.
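The default can be applied at decode time, as in the sketch below. The function name and the shape of the params argument (a dictionary keyed by lower-cased parameter name) are assumptions made for illustration, and the charset label is handed directly to Python's codec lookup.

```python
DEFAULT_TEXT_CHARSET = "ISO-8859-1"  # default for "text" media received via DCP


def decode_text_body(body, main_type, params):
    """Decode a text entity body, falling back to the default charset."""
    if main_type.lower() != "text":
        raise ValueError("the charset default applies only to text media")
    charset = params.get("charset", DEFAULT_TEXT_CHARSET)
    return body.decode(charset)
```

An unlabeled body containing the octet 0xE9 therefore decodes as "é", while a body in any other character set is only interpreted correctly when the sender labels it.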
MIME provides for a number of "multipart" types (encapsulations of several
entities within a single message's entity body).
The multipart types registered by IANA do
not have any special meaning for DCP, though resources may need to understand each type
in order to correctly interpret the purpose of each body part. DCP resources should
follow the same or similar behavior as a MIME user agent does upon receipt of a multipart
type. DCP resources should not assume that other DCP resources are prepared to handle multipart
types.
All multipart types share a common syntax and must include a boundary
parameter as part
of the mediaType value. The message body is itself a protocol element and must therefore
use only crlf to represent line breaks between
body parts. Multipart body parts may
contain DCP header fields which are significant to the meaning of that part.
Product tokens are used to allow communicating resources to identify themselves via
a simple product token, with an optional slash and version (or model) designator. Most fields using
product tokens also allow sub-products which form a significant part of the application to
be listed, separated by white space. By convention, the products are listed in order of
their significance for identifying the resource.
Syntax
Examples
Application: DCP-Browser/1.1
Device: ADT-Security-2001/1.7
Product tokens should be short and to the point (use of them for advertising or other
non-essential information is explicitly forbidden). Although any token character may appear
in a productVersion, this token should only be used for a version identifier (i.e.,
successive versions of the same product should only differ in the productVersion portion
of the product value).
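Splitting a field value into its product tokens might look like the sketch below. The product syntax is not reproduced in this section, so the code assumes each product is a token with an optional "/" and productVersion, with products separated by white space. The function name and the "ExampleLib" sub-product are hypothetical.

```python
def parse_product_tokens(value):
    """Return a list of (product, productVersion or None) pairs, in order."""
    products = []
    for item in value.split():  # products are separated by white space
        name, _, version = item.partition("/")
        products.append((name, version or None))
    return products


# The first entry is the most significant product by convention.
tokens = parse_product_tokens("DCP-Browser/1.1 ExampleLib")
```

Keeping the list in its original order preserves the convention that products are listed by significance.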