INTERNET-DRAFT W. Richard Stevens (Consultant)
Expires: September 26, 1997 Matt Thomas (AltaVista)
March 26, 1997
Advanced Sockets API for IPv6
<draft-stevens-advanced-api-02.txt>
Abstract
Specifications are in progress for changes to the sockets API to
support IP version 6 [2]. These changes are for TCP and UDP-based
applications and will support most end-user applications in use
today: Telnet and FTP clients and servers, HTTP clients and servers,
and the like.
But another class of applications exists that will also be run under
IPv6. We call these "advanced" applications and today this includes
programs such as Ping, Traceroute, routing daemons, multicast routing
daemons, router discovery daemons, and the like. The API feature
typically used by these programs that makes them "advanced" is a raw
socket to access ICMPv4, IGMPv4, or IPv4, along with some knowledge
of the packet header formats used by these protocols. To provide
portability for applications that use raw sockets under IPv6, some
standardization is needed for the advanced API features.
There are other features of IPv6 that some applications will need to
access: interface identification (specifying the outgoing interface
and determining the incoming interface) and IPv6 extension headers
that are not addressed in [2]: Hop-by-Hop options, Destination
options, and the Routing header (source routing). This document
provides API access to these features too.
Status of this Memo
This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. Note that other groups may also distribute
working documents as Internet Drafts.
Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use Internet
Drafts as reference material or to cite them other than as a "working
draft" or "work in progress".
Stevens & Thomas [Page 1]
INTERNET-DRAFT Advanced Sockets API for IPv6 March 26, 1997
To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the internet-drafts Shadow
Directories on: ftp.is.co.za (Africa), nic.nordu.net (Europe),
ds.internic.net (US East Coast), ftp.isi.edu (US West Coast), and
munnari.oz.au (Pacific Rim).
Table of Contents
1. Introduction .................................................... 5
2. Common Structures and Definitions ............................... 6
2.1. The ip6_hdr Structure ...................................... 6
2.1.1. IPv6 Next Header Values ............................. 7
2.2. The icmp6_hdr Structure .................................... 7
2.2.1. ICMPv6 Type and Code Values ......................... 8
2.2.2. ICMPv6 Neighbor Discovery Type and Code Values ...... 9
2.3. Address Testing Functions .................................. 11
2.4. Protocols File ............................................. 12
3. IPv6 Raw Sockets ................................................ 12
3.1. Checksums .................................................. 13
3.2. ICMPv6 Type Filtering ...................................... 13
4. Ancillary Data .................................................. 16
4.1. The msghdr Structure ....................................... 17
4.2. The cmsghdr Structure ...................................... 18
4.3. Ancillary Data Object Functions ............................ 19
4.3.1. CMSG_FIRSTHDR ....................................... 20
4.3.2. CMSG_NXTHDR ......................................... 20
4.3.3. CMSG_DATA ........................................... 21
4.3.4. CMSG_SPACE .......................................... 22
4.3.5. CMSG_LEN ............................................ 22
4.4. Summary of Options Described Using Ancillary Data .......... 22
4.5. TCP Access to Ancillary Data ............................... 24
5. Packet Information .............................................. 25
5.1. Specifying/Receiving the Interface ......................... 26
5.2. Specifying/Receiving Source/Destination Address ............ 27
5.3. Specifying/Receiving the Hop Limit ......................... 27
5.4. Specifying the Next Hop Address ............................ 28
5.5. Additional Errors with sendmsg() ........................... 28
6. Flow Labels ..................................................... 29
6.1. inet6_flow_assign .......................................... 31
6.2. inet6_flow_free ............................................ 32
6.3. inet6_flow_reuse ........................................... 32
7. Hop-By-Hop Options .............................................. 33
7.1. Receiving Hop-by-Hop Options ............................... 34
7.2. Sending Hop-by-Hop Options ................................. 35
7.3. Hop-by-Hop and Destination Options Processing .............. 35
7.3.1. inet6_option_space .................................. 35
7.3.2. inet6_option_init ................................... 36
7.3.3. inet6_option_append ................................. 36
7.3.4. inet6_option_alloc .................................. 37
7.3.5. inet6_option_next ................................... 38
7.3.6. inet6_option_find ................................... 38
7.3.7. Options Examples .................................... 39
8. Destination Options ............................................. 46
8.1. Receiving Destination Options .............................. 46
8.2. Sending Destination Options ................................ 47
9. Source Route Option ............................................. 47
9.1. inet6_srcrt_space .......................................... 48
9.2. inet6_srcrt_init ........................................... 49
9.3. inet6_srcrt_add ............................................ 49
9.4. inet6_srcrt_lasthop ........................................ 50
9.5. inet6_srcrt_reverse ........................................ 50
9.6. inet6_srcrt_segments ....................................... 50
9.7. inet6_srcrt_getaddr ........................................ 51
9.8. inet6_srcrt_getflags ....................................... 51
9.9. Source Route Example ....................................... 51
10. Ordering of Ancillary Data and IPv6 Extension Headers ........... 56
11. IPv6-Specific Options with IPv4-Mapped IPv6 Addresses ........... 58
12. rresvport_af .................................................... 58
13. Future Items .................................................... 59
13.1. Path MTU Discovery and UDP ................................ 59
13.2. Neighbor Reachability and UDP ............................. 59
14. Summary of New Definitions ...................................... 60
15. Security Considerations ......................................... 63
16. Change History .................................................. 63
17. References ...................................................... 66
18. Acknowledgments ................................................. 66
19. Authors' Addresses .............................................. 66
1. Introduction
Specifications are in progress for changes to the sockets API to
support IP version 6 [2]. These changes are for TCP and UDP-based
applications. The current document defines some of the "advanced"
features of the sockets API that are required for applications to
take advantage of additional features of IPv6.
Today, the portability of applications using IPv4 raw sockets is
quite high, but this is mainly because most IPv4 implementations
started from a common base (the Berkeley source code) or at least
started with the Berkeley headers. This allows programs such as Ping
and Traceroute, for example, to compile with minimal effort on many
hosts that support the sockets API. With IPv6, however, there is no
common source code base that implementors are starting from, and the
possibility for divergence at this level between different
implementations is high. To avoid a complete lack of portability
amongst applications that use raw IPv6 sockets, some standardization
is necessary.
There are also features from the basic IPv6 specification that are
not addressed in [2]: sending and receiving Hop-by-Hop options,
Destination options, and Routing headers, specifying the outgoing
interface, and being told of the receiving interface.
This document can be divided into the following main sections.
1. Definitions of the basic constants and structures required for
applications to use raw IPv6 sockets. This includes structure
definitions for the IPv6 and ICMPv6 headers and all associated
constants (e.g., values for the Next Header field).
2. Some basic semantic definitions for IPv6 raw sockets. For
example, a raw ICMPv4 socket requires the application to
calculate and store the ICMPv4 header checksum. But with IPv6
this would require the application to choose the source IPv6
address because the source address is part of the pseudo header
that ICMPv6 now uses for its checksum computation. It should be
defined that with a raw ICMPv6 socket the kernel always
calculates and stores the ICMPv6 header checksum.
3. Packet information: how applications can obtain the received
interface, destination address, and received hop limit, along
with specifying these values on a per-packet basis. There is a
class of applications that need this capability and the technique
should be portable.
4. Access to the optional Hop-by-Hop, Destination, and Routing
headers.
5. Additional features required for IPv6 application portability.
The packet information along with access to the extension headers
(Hop-by-Hop options, Destination options, and Routing header) are
specified using the "ancillary data" fields that were added to the
4.3BSD Reno sockets API in 1990. The reason is that these ancillary
data fields are part of the Posix.1g standard (which should be
approved in 1997) and should therefore be adopted by most vendors.
This document does not address application access to either the
authentication header or the encapsulating security payload header.
All examples in this document omit error checking in favor of brevity
and clarity.
We note that many of the functions and socket options defined in this
document may have error returns that are not defined in this
document. Many of these possible error returns will be recognized
only as implementations proceed.
Datatypes in this document follow the Posix.1g format: u_intN_t means
an unsigned integer of exactly N bits (e.g., u_int16_t) and u_intNm_t
means an unsigned integer of at least N bits (e.g., u_int32m_t).
Note that we use the (unofficial) terminology ICMPv4, IGMPv4, and
ARPv4 to avoid any confusion with the newer ICMPv6 protocol.
2. Common Structures and Definitions
Many advanced applications examine fields in the IPv6 header and set
and examine fields in the various ICMPv6 headers. Common structure
definitions for these headers are required, along with common
constant definitions for the structure members.
When an include file is specified, that include file is allowed to
include other files that do the actual declaration or definition.
2.1. The ip6_hdr Structure
The following structure is defined as a result of including
<netinet/ip6.h>. Note that this is a new header.
struct ip6_hdr {
    union {
        struct ip6_hdrctl {
            u_int32_t ctl6_flow;  /* 24 bits of flow-ID */
            u_int16_t ctl6_plen;  /* payload length */
            u_int8_t  ctl6_nxt;   /* next header */
            u_int8_t  ctl6_hlim;  /* hop limit */
        } un_ctl6;
        u_int8_t un_vfc;          /* 4 bits version, 4 bits priority */
    } ip6_ctlun;
    struct in6_addr ip6_src;      /* source address */
    struct in6_addr ip6_dst;      /* destination address */
};

#define ip6_vfc   ip6_ctlun.un_vfc
#define ip6_flow  ip6_ctlun.un_ctl6.ctl6_flow
#define ip6_plen  ip6_ctlun.un_ctl6.ctl6_plen
#define ip6_nxt   ip6_ctlun.un_ctl6.ctl6_nxt
#define ip6_hlim  ip6_ctlun.un_ctl6.ctl6_hlim
#define ip6_hops  ip6_ctlun.un_ctl6.ctl6_hlim
2.1.1. IPv6 Next Header Values
IPv6 defines many new values for the Next Header field. The
following constants are defined as a result of including
<netinet/in.h>.
#define IPPROTO_HOPOPTS 0 /* IPv6 Hop-by-Hop options */
#define IPPROTO_IPV6 41 /* IPv6 header */
#define IPPROTO_ROUTING 43 /* IPv6 Routing header */
#define IPPROTO_FRAGMENT 44 /* IPv6 fragmentation header */
#define IPPROTO_ESP 50 /* encapsulating security payload */
#define IPPROTO_AH 51 /* authentication header */
#define IPPROTO_ICMPV6 58 /* ICMPv6 */
#define IPPROTO_NONE 59 /* IPv6 no next header */
#define IPPROTO_DSTOPTS 60 /* IPv6 Destination options */
Berkeley-derived IPv4 implementations also define IPPROTO_IP to be 0.
This should not be a problem since IPPROTO_IP is used only with IPv4
sockets and IPPROTO_HOPOPTS only with IPv6 sockets.
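As an illustration of how these definitions might be used, the following sketch copies the fixed IPv6 header out of a buffer holding a complete packet (obtained, for example, through a datalink interface) and returns the version number. The function name parse_ip6_header is hypothetical and is not part of this API. The sketch assumes that the host's <netinet/ip6.h> provides the ip6_vfc, ip6_plen, ip6_nxt, and ip6_hlim names shown in Section 2.1.

```c
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>    /* ntohs() */
#include <netinet/in.h>   /* IPPROTO_xxx constants, struct in6_addr */
#include <netinet/ip6.h>  /* struct ip6_hdr and the ip6_xxx names */

/*
 * Hypothetical helper (not defined by this API).
 * Copies the fixed IPv6 header from buf into *out, so the caller never
 * dereferences a possibly misaligned pointer, and returns the 4-bit
 * version number.  Returns -1 if buf is shorter than the fixed header.
 */
int parse_ip6_header(const unsigned char *buf, size_t len,
                     struct ip6_hdr *out)
{
    if (len < sizeof(struct ip6_hdr))
        return -1;
    memcpy(out, buf, sizeof(*out));
    /* The version occupies the upper 4 bits of the first byte. */
    return (out->ip6_vfc >> 4) & 0x0f;
}
```

A caller would then convert ip6_plen with ntohs() and compare ip6_nxt against the constants above, for example IPPROTO_ICMPV6.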
2.2. The icmp6_hdr Structure
The ICMPv6 header is needed by numerous IPv6 applications including
Ping, Traceroute, router discovery daemons, and neighbor discovery
daemons. The following structure is defined as a result of including
<netinet/icmp6.h>. Note that this is a new header.
struct icmp6_hdr {
    u_int8_t  icmp6_type;              /* type field */
    u_int8_t  icmp6_code;              /* code field */
    u_int16_t icmp6_cksum;             /* checksum field */
    union {
        u_int32_t icmp6_un_data32[1];  /* type-specific field */
        u_int16_t icmp6_un_data16[2];  /* type-specific field */
        u_int8_t  icmp6_un_data8[4];   /* type-specific field */
    } icmp6_dataun;
};

#define icmp6_data32    icmp6_dataun.icmp6_un_data32
#define icmp6_data16    icmp6_dataun.icmp6_un_data16
#define icmp6_data8     icmp6_dataun.icmp6_un_data8
#define icmp6_pptr      icmp6_data32[0]  /* parameter prob */
#define icmp6_mtu       icmp6_data32[0]  /* packet too big */
#define icmp6_id        icmp6_data16[0]  /* echo request/reply */
#define icmp6_seq       icmp6_data16[1]  /* echo request/reply */
#define icmp6_maxdelay  icmp6_data16[0]  /* mcast group membership */
2.2.1. ICMPv6 Type and Code Values
In addition to a common structure for the ICMPv6 header, common
definitions are required for the ICMPv6 type and code fields. The
following constants are also defined as a result of including
<netinet/icmp6.h>.
#define ICMPV6_DEST_UNREACH              1
#define ICMPV6_PACKET_TOOBIG             2
#define ICMPV6_TIME_EXCEEDED             3
#define ICMPV6_PARAMPROB                 4

#define ICMPV6_INFOMSG_MASK              0x80  /* all informational messages */

#define ICMPV6_ECHOREQUEST               128
#define ICMPV6_ECHOREPLY                 129
#define ICMPV6_MGM_QUERY                 130
#define ICMPV6_MGM_REPORT                131
#define ICMPV6_MGM_REDUCTION             132

#define ICMPV6_DEST_UNREACH_NOROUTE      0  /* no route to destination */
#define ICMPV6_DEST_UNREACH_ADMIN        1  /* communication with destination
                                               administratively prohibited */
#define ICMPV6_DEST_UNREACH_NOTNEIGHBOR  2  /* not a neighbor */
#define ICMPV6_DEST_UNREACH_ADDR         3  /* address unreachable */
#define ICMPV6_DEST_UNREACH_NOPORT       4  /* bad port */

#define ICMPV6_TIME_EXCEED_HOPS          0  /* Hop Limit == 0 in transit */
#define ICMPV6_TIME_EXCEED_REASSEMBLY    1  /* Reassembly time out */

#define ICMPV6_PARAMPROB_HEADER          0  /* erroneous header field */
#define ICMPV6_PARAMPROB_NEXTHEADER      1  /* unrecognized Next Header */
#define ICMPV6_PARAMPROB_OPTION          2  /* unrecognized IPv6 option */
The five ICMP message types defined by IPv6 neighbor discovery
(133-137) are defined in the next section.
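The following sketch suggests how a Ping-like program might fill in the ICMPv6 header for an echo request using the structure and constants above. The function name build_echo_request is hypothetical. Because a given host's <netinet/icmp6.h> may name the echo request constant differently, the sketch defines ICMPV6_ECHOREQUEST locally with the value given in this section when the header does not supply it. The checksum field is left as zero on the assumption that the kernel fills it in for an ICMPv6 raw socket (Section 3.1).

```c
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>      /* htons() */
#include <netinet/in.h>
#include <netinet/icmp6.h>  /* struct icmp6_hdr, icmp6_id, icmp6_seq */

/* Value from this section, defined here only if the header lacks it. */
#ifndef ICMPV6_ECHOREQUEST
#define ICMPV6_ECHOREQUEST 128
#endif

/*
 * Hypothetical helper (not defined by this API).
 * Writes an ICMPv6 echo request header into buf and returns the number
 * of bytes written, or 0 if buf is too small.
 */
size_t build_echo_request(unsigned char *buf, size_t buflen,
                          unsigned short id, unsigned short seq)
{
    struct icmp6_hdr hdr;

    if (buflen < sizeof(hdr))
        return 0;
    memset(&hdr, 0, sizeof(hdr));
    hdr.icmp6_type  = ICMPV6_ECHOREQUEST;
    hdr.icmp6_code  = 0;
    hdr.icmp6_cksum = 0;           /* left for the kernel to compute */
    hdr.icmp6_id    = htons(id);   /* fields travel in network byte order */
    hdr.icmp6_seq   = htons(seq);
    memcpy(buf, &hdr, sizeof(hdr));
    return sizeof(hdr);
}
```

Any echo data chosen by the application would follow these header bytes in the buffer passed to sendto() or sendmsg().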
2.2.2. ICMPv6 Neighbor Discovery Type and Code Values
The following constants are defined as a result of including
<netinet/icmp6.h>.
#define ND6_ROUTER_SOLICITATION     133
#define ND6_ROUTER_ADVERTISEMENT    134
#define ND6_NEIGHBOR_SOLICITATION   135
#define ND6_NEIGHBOR_ADVERTISEMENT  136
#define ND6_REDIRECT                137

enum nd6_option {
    ND6_OPT_SOURCE_LINKADDR=1,
    ND6_OPT_TARGET_LINKADDR=2,
    ND6_OPT_PREFIX_INFORMATION=3,
    ND6_OPT_REDIRECTED_HEADER=4,
    ND6_OPT_MTU=5,
    ND6_OPT_ENDOFLIST=256
};

struct nd6_router_solicit {      /* router solicitation */
    struct icmp6_hdr rsol_hdr;
};

#define rsol_type      rsol_hdr.icmp6_type
#define rsol_code      rsol_hdr.icmp6_code
#define rsol_cksum     rsol_hdr.icmp6_cksum
#define rsol_reserved  rsol_hdr.icmp6_data32[0]

struct nd6_router_advert {       /* router advertisement */
    struct icmp6_hdr radv_hdr;
    u_int32_t radv_reachable;    /* reachable time */
    u_int32_t radv_retransmit;   /* reachable retransmit time */
};

#define radv_type             radv_hdr.icmp6_type
#define radv_code             radv_hdr.icmp6_code
#define radv_cksum            radv_hdr.icmp6_cksum
#define radv_maxhoplimit      radv_hdr.icmp6_data8[0]
#define radv_m_o_res          radv_hdr.icmp6_data8[1]
#define ND6_RADV_M_BIT        0x80
#define ND6_RADV_O_BIT        0x40
#define radv_router_lifetime  radv_hdr.icmp6_data16[1]

struct nd6_nsolicitation {       /* neighbor solicitation */
    struct icmp6_hdr nsol6_hdr;
    struct in6_addr  nsol6_target;
};

struct nd6_nadvertisement {      /* neighbor advertisement */
    struct icmp6_hdr nadv6_hdr;
    struct in6_addr  nadv6_target;
};

#define nadv6_flags               nadv6_hdr.icmp6_data32[0]
#define ND6_NADVERFLAG_ISROUTER   0x80
#define ND6_NADVERFLAG_SOLICITED  0x40
#define ND6_NADVERFLAG_OVERRIDE   0x20

struct nd6_redirect {            /* redirect */
    struct icmp6_hdr redirect_hdr;
    struct in6_addr  redirect_target;
    struct in6_addr  redirect_destination;
};

struct nd6_opt_prefix_info {     /* prefix information */
    u_int8_t  opt_type;
    u_int8_t  opt_length;
    u_int8_t  opt_prefix_length;
    u_int8_t  opt_l_a_res;
    u_int32_t opt_valid_life;
    u_int32_t opt_preferred_life;
    u_int32_t opt_reserved2;
    struct in6_addr opt_prefix;
};

#define ND6_OPT_PI_L_BIT  0x80
#define ND6_OPT_PI_A_BIT  0x40

struct nd6_opt_mtu {             /* MTU option */
    u_int8_t  opt_type;
    u_int8_t  opt_length;
    u_int16_t opt_reserved;
    u_int32_t opt_mtu;
};
2.3. Address Testing Functions
The basic API ([2]) defines some functions for testing an IPv6
address for certain properties. This API extends those definitions
with additional address testing functions, defined as a result of
including <netinet/in.h>.
int IN6_ARE_ADDR_EQUAL(const struct in6_addr *,
const struct in6_addr *);
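This document does not say how IN6_ARE_ADDR_EQUAL is to be implemented. The following sketch shows one plausible approach, a byte-wise comparison of the two 128-bit addresses, together with a small caller that compares two addresses given in presentation format. Both function names are hypothetical and were chosen so as not to collide with any definition the host's headers may already provide.

```c
#include <string.h>
#include <sys/socket.h>   /* AF_INET6 */
#include <netinet/in.h>   /* struct in6_addr */
#include <arpa/inet.h>    /* inet_pton() */

/*
 * Hypothetical stand-in for IN6_ARE_ADDR_EQUAL: returns nonzero if the
 * two addresses are identical in all 16 bytes.
 */
int sketch_in6_are_addr_equal(const struct in6_addr *a,
                              const struct in6_addr *b)
{
    return memcmp(a, b, sizeof(struct in6_addr)) == 0;
}

/*
 * Hypothetical caller: converts two presentation-format strings and
 * compares the results.  Returns 1 if equal, 0 if different, and -1 if
 * either string is not a valid IPv6 address.
 */
int same_ipv6_text(const char *x, const char *y)
{
    struct in6_addr ax, ay;

    if (inet_pton(AF_INET6, x, &ax) != 1 ||
        inet_pton(AF_INET6, y, &ay) != 1)
        return -1;
    return sketch_in6_are_addr_equal(&ax, &ay) ? 1 : 0;
}
```

Comparing the binary form rather than the text avoids false mismatches between different spellings of the same address, such as "::1" and "0:0:0:0:0:0:0:1".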
2.4. Protocols File
Many hosts provide the file /etc/protocols that contains the names of
the various IP protocols and their protocol number (e.g., the value
of the protocol field in the IPv4 header for that protocol, such as 1
for ICMP). Some programs then call the function getprotobyname() to
obtain the protocol value that is then specified as the third
argument to the socket() function. For example, the Ping program
contains code of the form
struct protoent *proto;
proto = getprotobyname("icmp");
s = socket(AF_INET, SOCK_RAW, proto->p_proto);
Common names are required for the new IPv6 protocols in this file, to
provide portability of applications that call the getprotoXXX()
functions.
We define the two protocol names
ipv6
icmpv6
with values 41 and 58 (decimal), respectively.
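A program that looks up these names cannot assume that every host's protocols file already contains them. The following sketch, with the hypothetical function name protocol_number, tries getprotobyname() first and falls back to a caller-supplied value (here the numbers given above) when the name is not found.

```c
#include <stddef.h>
#include <netdb.h>   /* getprotobyname(), struct protoent */

/*
 * Hypothetical helper (not defined by this API).
 * Returns the protocol number registered under name, or fallback if the
 * host's protocols database has no entry with that name.
 */
int protocol_number(const char *name, int fallback)
{
    struct protoent *proto = getprotobyname(name);

    return (proto != NULL) ? proto->p_proto : fallback;
}
```

An ICMPv6 raw socket could then be requested with socket(PF_INET6, SOCK_RAW, protocol_number("icmpv6", 58)).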
3. IPv6 Raw Sockets
Raw sockets bypass the transport layer (TCP or UDP). With IPv4, raw
sockets are used to access ICMPv4, IGMPv4, and to read and write IPv4
datagrams containing a protocol field that the kernel does not
process. An example of the latter is a routing daemon for OSPF,
since it uses IPv4 protocol field 89. With IPv6, raw sockets will be
used for ICMPv6 and to read and write IPv6 datagrams containing a
Next Header field that the kernel does not process. Examples of the
latter are a routing daemon for OSPF for IPv6 and RSVP (protocol
field 46).
All data sent via raw sockets MUST be in network byte order and all
data received via raw sockets will be in network byte order. This
differs from the IPv4 raw sockets, which did not specify a byte
ordering and typically used the host's byte order.
Another difference from IPv4 raw sockets is that complete packets
(that is, IPv6 packets with extension headers) cannot be transferred
via the IPv6 raw sockets API. Instead, ancillary data objects are
used to transfer the extension headers, as described later in this
document. Should an application need access to the complete IPv6
packet, some other technique, such as the datalink interfaces BPF or
DLPI, must be used.
All fields in the IPv6 header that an application might want to
change (i.e., everything other than the version number) can be
modified by the application. All fields in a received IPv6 header
(other than the version number and Next Header fields) and all
extension headers are also made available to the application. Hence
there is no need for a socket option similar to the IPv4 IP_HDRINCL
socket option.
When we say "an ICMPv6 raw socket" we mean a socket created by
calling the socket function with the three arguments PF_INET6,
SOCK_RAW, and IPPROTO_ICMPV6.
3.1. Checksums
The kernel will calculate and insert the ICMPv6 checksum for ICMPv6
raw sockets, since this checksum is mandatory.
For other raw IPv6 sockets (that is, for raw IPv6 sockets created
with a third argument other than IPPROTO_ICMPV6), the application
must set the new IPV6_CHECKSUM socket option to have the kernel
compute and store a checksum. This option prevents applications from
having to perform source address selection on the packets they send.
The checksum will incorporate the IPv6 pseudo-header, defined in
Section 8.1 of [1]. This new socket option also specifies an integer
offset into the user data of where the checksum is to be placed.
int offset = 2;
setsockopt(fd, IPPROTO_IPV6, IPV6_CHECKSUM, &offset, sizeof(offset));
By default, this socket option is disabled, which means the kernel
will not calculate and store a checksum. An application can also
disable the option explicitly by setting the offset to -1.
(Note: Since the checksum is always calculated by the kernel for an
ICMPv6 socket, applications are not able to generate ICMPv6 packets
with incorrect checksums (presumably for testing purposes) using this
API.)
3.2. ICMPv6 Type Filtering
ICMPv4 raw sockets receive most ICMPv4 messages received by the
kernel. (We say "most" and not "all" because Berkeley-derived
kernels never pass echo requests, timestamp requests, or address mask
requests to a raw socket. Instead these three messages are processed
entirely by the kernel.) But ICMPv6 is a superset of ICMPv4, also
including the functionality of IGMPv4 and ARPv4. This means that an
ICMPv6 raw socket can potentially receive many more messages than
would be received with an ICMPv4 raw socket: ICMP messages similar to
ICMPv4, along with neighbor solicitations, neighbor advertisements,
and the three group membership messages.
Most applications using an ICMPv6 raw socket care about only a small
subset of the ICMPv6 message types. Transferring extraneous ICMPv6
messages from the kernel to the user process can incur significant
overhead.
Therefore this API includes a method of filtering ICMPv6 messages by
the ICMPv6 type field.
Each ICMPv6 raw socket has an associated filter whose datatype is
defined as
struct icmp6_filter;
This structure, along with the functions and constants defined later
in this section, is defined as a result of including the
<netinet/icmp6.h> header.
The current filter is fetched and stored using getsockopt() and
setsockopt() with a level of IPPROTO_ICMPV6 and an option name of
ICMPV6_FILTER.
Six functions operate on an icmp6_filter structure:
void ICMPV6_FILTER_SETPASSALL (struct icmp6_filter *);
void ICMPV6_FILTER_SETBLOCKALL(struct icmp6_filter *);
void ICMPV6_FILTER_SETPASS (int, struct icmp6_filter *);
void ICMPV6_FILTER_SETBLOCK(int, struct icmp6_filter *);
int  ICMPV6_FILTER_WILLPASS (int, const struct icmp6_filter *);
int  ICMPV6_FILTER_WILLBLOCK(int, const struct icmp6_filter *);
The first argument to the last four functions (an integer) is an
ICMPv6 message type, between 0 and 255. The pointer argument to all
six functions is a pointer to a filter that is modified by the first
four functions and examined by the last two functions.
The first two functions, SETPASSALL and SETBLOCKALL, let us specify
that all ICMPv6 messages are passed to the application or that all
ICMPv6 messages are blocked from being passed to the application.
The next two functions, SETPASS and SETBLOCK, let us specify that
messages of a given ICMPv6 type should be passed to the application
or not passed to the application (blocked).
The final two functions, WILLPASS and WILLBLOCK, return true or false
depending on whether the specified message type is passed to the
application or blocked from being passed to the application by the
filter pointed to by the second argument.
When an ICMPv6 raw socket is created, it will by default pass all
ICMPv6 message types to the application.
As an example, a Ping program could execute the following:
struct icmp6_filter myfilt;
fd = socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6);
ICMPV6_FILTER_SETBLOCKALL(&myfilt);
ICMPV6_FILTER_SETPASS(ICMPV6_ECHOREPLY, &myfilt);
setsockopt(fd, IPPROTO_ICMPV6, ICMPV6_FILTER, &myfilt, sizeof(myfilt));
The filter structure is declared and then initialized to block all
message types. The filter structure is then changed to allow ICMPv6
echo reply messages to be passed to the application and the filter is
installed using setsockopt().
The icmp6_filter structure is similar to the fd_set datatype used
with the select() function in the sockets API. The icmp6_filter
structure is an opaque datatype and the application should not care
how it is implemented. All the application does with this datatype
is allocate a variable of this type, pass a pointer to a variable of
this type to getsockopt() and setsockopt(), and operate on a variable
of this type using the six functions that we just defined.
Nevertheless, it is worth showing a simple implementation of this
datatype and the six functions, which can be implemented as C macros.
struct icmp6_filter {
    u_int32m_t data[8];  /* 8*32 = 256 bits */
};

#define ICMPV6_FILTER_WILLPASS(type, filterp) \
    ((((filterp)->data[(type) >> 5]) & (1 << ((type) & 31))) != 0)
#define ICMPV6_FILTER_WILLBLOCK(type, filterp) \
    ((((filterp)->data[(type) >> 5]) & (1 << ((type) & 31))) == 0)
#define ICMPV6_FILTER_SETPASS(type, filterp) \
    ((((filterp)->data[(type) >> 5]) |= (1 << ((type) & 31))))
#define ICMPV6_FILTER_SETBLOCK(type, filterp) \
    ((((filterp)->data[(type) >> 5]) &= ~(1 << ((type) & 31))))
#define ICMPV6_FILTER_SETPASSALL(filterp) \
    memset((filterp), 0xFF, sizeof(struct icmp6_filter))
#define ICMPV6_FILTER_SETBLOCKALL(filterp) \
    memset((filterp), 0, sizeof(struct icmp6_filter))
(Note: These sample definitions have two limitations that an
implementation may want to change. The first four macros evaluate
their first argument two times. The last two macros require the
inclusion of the <string.h> header for the memset() function.)
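One way an implementation might avoid the double evaluation of the type argument is to provide the operations as inline functions. The following sketch is only an illustration under that assumption: the structure and function names (filter_sketch and so on) are hypothetical, uint32_t stands in for the u_int32m_t datatype, an unsigned constant is used for the shift, and a type outside the range 0 through 255 is reduced to its low-order 8 bits, a choice this document does not specify.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical equivalent of struct icmp6_filter: one bit per type. */
struct filter_sketch {
    uint32_t data[8];   /* 8*32 = 256 bits */
};

/* Each function evaluates its type argument exactly once. */
static inline unsigned int filter_sketch_index(int type)
{
    return (unsigned int)type & 0xFFu;   /* keep within 0..255 */
}

static inline void filter_sketch_setpassall(struct filter_sketch *f)
{
    memset(f, 0xFF, sizeof(*f));
}

static inline void filter_sketch_setblockall(struct filter_sketch *f)
{
    memset(f, 0, sizeof(*f));
}

static inline void filter_sketch_setpass(int type, struct filter_sketch *f)
{
    unsigned int t = filter_sketch_index(type);
    f->data[t >> 5] |= UINT32_C(1) << (t & 31);
}

static inline void filter_sketch_setblock(int type, struct filter_sketch *f)
{
    unsigned int t = filter_sketch_index(type);
    f->data[t >> 5] &= ~(UINT32_C(1) << (t & 31));
}

static inline int filter_sketch_willpass(int type,
                                         const struct filter_sketch *f)
{
    unsigned int t = filter_sketch_index(type);
    return (f->data[t >> 5] & (UINT32_C(1) << (t & 31))) != 0;
}

static inline int filter_sketch_willblock(int type,
                                          const struct filter_sketch *f)
{
    return !filter_sketch_willpass(type, f);
}
```

With functions of this form, a call such as filter_sketch_setpass(next_type(), &filt), where next_type() is some application function with side effects, invokes next_type() only once.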
4. Ancillary Data
4.2BSD allowed file descriptors to be transferred between separate
processes across a UNIX domain socket using the sendmsg() and
recvmsg() functions. Two members of the msghdr structure,
msg_accrights and msg_accrightslen, were used to send and receive the
descriptors. When the OSI protocols were added to 4.3BSD Reno in
1990 the names of these two fields in the msghdr structure were
changed to msg_control and msg_controllen, because they were used by
the OSI protocols for "control information", although the comments in
the source code call this "ancillary data".
Other than the OSI protocols, the use of ancillary data has been
rare. In 4.4BSD, for example, the only use of ancillary data with
IPv4 is to return the destination address of a received UDP datagram
if the IP_RECVDSTADDR socket option is set. With Unix domain sockets
ancillary data is still used to send and receive descriptors.
Nevertheless the ancillary data fields of the msghdr structure
provide a clean way to pass information in addition to the data that
is being read or written. The inclusion of the msg_control and
msg_controllen members of the msghdr structure along with the cmsghdr
structure that is pointed to by the msg_control member is required by
the Posix.1g sockets API standard (which should be completed during
1997).
In this document ancillary data is used to exchange the following
optional information between the application and the kernel:
1. the send/receive interface and source/destination address,
2. the hop limit,
3. next hop address,
4. Hop-by-Hop options,
5. Destination options, and
6. Routing header.
Before describing these uses in detail, we review the definition of
the msghdr structure itself, the cmsghdr structure that defines an
ancillary data object, and some functions that operate on the
ancillary data objects.
4.1. The msghdr Structure
The msghdr structure is used by the recvmsg() and sendmsg()
functions. Its Posix.1g definition is:
struct msghdr {
    void         *msg_name;        /* ptr to socket address structure */
    size_t        msg_namelen;     /* size of socket address structure */
    struct iovec *msg_iov;         /* scatter/gather array */
    size_t        msg_iovlen;      /* # elements in msg_iov */
    void         *msg_control;     /* ancillary data */
    size_t        msg_controllen;  /* ancillary data buffer length */
    int           msg_flags;       /* flags on received message */
};
The structure is declared as a result of including <sys/socket.h>.
(Note: Before Posix.1g the two "void *" pointers were typically "char
*", and the three size_t members were typically integers. The change
in msg_control to a "void *" pointer affects any code that increments
this pointer.)
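As a minimal sketch of how an application might fill in this structure before calling recvmsg(), the hypothetical function prepare_msghdr below sets up a single data buffer, a buffer for the peer's address, and a buffer for ancillary data. The exact integer types of the length members may differ from those shown above on a given system, so the sketch relies only on ordinary assignment conversions.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>   /* struct msghdr, socklen_t */
#include <sys/uio.h>      /* struct iovec */

/*
 * Hypothetical helper (not defined by this API).
 * Initializes *msg to describe one data buffer, one address buffer,
 * and one ancillary data buffer.  All other members are zeroed.
 */
void prepare_msghdr(struct msghdr *msg, struct iovec *iov,
                    void *data, size_t datalen,
                    void *name, socklen_t namelen,
                    void *control, size_t controllen)
{
    memset(msg, 0, sizeof(*msg));

    iov->iov_base = data;          /* where the packet data goes */
    iov->iov_len  = datalen;

    msg->msg_name       = name;    /* filled with the sender's address */
    msg->msg_namelen    = namelen;
    msg->msg_iov        = iov;
    msg->msg_iovlen     = 1;       /* one element in the array */
    msg->msg_control    = control; /* filled with ancillary data */
    msg->msg_controllen = controllen;
}
```

After recvmsg() returns, msg_controllen holds the length of the ancillary data actually stored, so the structure should be prepared again before each call.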
Most Berkeley-derived implementations limit the amount of ancillary
data in a call to sendmsg() to no more than 108 bytes (an mbuf).
This API requires a minimum of 10240 bytes of ancillary data, but it
is recommended that the amount be limited only by the buffer space
reserved by the socket (which can be modified by the SO_SNDBUF socket
option). (Note: This magic number 10240 was picked as a value that
should always be large enough. 108 bytes is clearly too small as the
maximum size of a Type 0 Routing header is 376 bytes.)
4.2. The cmsghdr Structure
The cmsghdr structure describes ancillary data objects transferred by
recvmsg() and sendmsg(). Its Posix.1g definition is:
struct cmsghdr {
    size_t cmsg_len;    /* #bytes, including this header */
    int    cmsg_level;  /* originating protocol */
    int    cmsg_type;   /* protocol-specific type */
                /* followed by unsigned char cmsg_data[]; */
};
This structure is declared as a result of including <sys/socket.h>.
As shown in this definition, normally there is no member with the
name cmsg_data[]. Instead, the data portion is accessed using the
CMSG_xxx() functions, as described shortly. Nevertheless, it is
common to refer to the cmsg_data[] member.
(Note: Before Posix.1g the cmsg_len member was an integer, and not a
size_t. On a 32-bit architecture this probably has no effect, but on
a 64-bit architecture this could change the size of this member from
4 bytes to 8 bytes and force 8 byte alignment for the structure.)
When ancillary data is sent or received, any number of ancillary data
objects can be specified by the msg_control and msg_controllen
members of the msghdr structure, because each object is preceded by a
cmsghdr structure defining the object's length (the cmsg_len member).
Historically Berkeley-derived implementations have passed only one
object at a time, but this API allows multiple objects to be passed
in a single call to sendmsg() or recvmsg(). The following example
shows two ancillary data objects in a control buffer.
|<--------------------------- msg_controllen -------------------------->|
| |
|<----- ancillary data object ----->|<----- ancillary data object ----->|
|<---------- CMSG_SPACE() --------->|<---------- CMSG_SPACE() --------->|
| | |
|<---------- cmsg_len ---------->| |<--------- cmsg_len ----------->| |
|<--------- CMSG_LEN() --------->| |<-------- CMSG_LEN() ---------->| |
| | | | |
+-----+-----+-----+--+-----------+--+-----+-----+-----+--+-----------+--+
|cmsg_|cmsg_|cmsg_|XX| |XX|cmsg_|cmsg_|cmsg_|XX| |XX|
|len |level|type |XX|cmsg_data[]|XX|len |level|type |XX|cmsg_data[]|XX|
+-----+-----+-----+--+-----------+--+-----+-----+-----+--+-----------+--+
^
|
msg_control
points here
The fields shown as "XX" are possible padding, between the cmsghdr
structure and the data, and between the data and the next cmsghdr
structure, if required by the implementation.
4.3. Ancillary Data Object Functions
To aid in the manipulation of ancillary data objects, three functions
from 4.4BSD are defined by Posix.1g: CMSG_DATA(), CMSG_NXTHDR(), and
CMSG_FIRSTHDR(). Before describing these functions, we show the
following example of how they might be used with a call to recvmsg().
    struct msghdr   msg;
    struct cmsghdr  *cmsgptr;
    /* fill in msg */
    /* call recvmsg() */
    for (cmsgptr = CMSG_FIRSTHDR(&msg); cmsgptr != NULL;
         cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) {
        if (cmsgptr->cmsg_level == ... && cmsgptr->cmsg_type == ... ) {
            u_char  *ptr;
            ptr = CMSG_DATA(cmsgptr);
            /* process data pointed to by ptr */
        }
    }
We now describe the three Posix.1g functions, followed by two more
that are new with this API: CMSG_SPACE() and CMSG_LEN(). All these
functions are defined as a result of including <sys/socket.h>.
4.3.1. CMSG_FIRSTHDR
struct cmsghdr *CMSG_FIRSTHDR(const struct msghdr *mhdr);
CMSG_FIRSTHDR() returns a pointer to the first cmsghdr structure in
the msghdr structure pointed to by mhdr. The function returns NULL
if there is no ancillary data pointed to by the msghdr structure
(that is, if either msg_control is NULL or if msg_controllen is less
than the size of a cmsghdr structure).
One possible implementation could be
    #define CMSG_FIRSTHDR(mhdr) \
        ( (mhdr)->msg_controllen >= sizeof(struct cmsghdr) ? \
          (struct cmsghdr *)(mhdr)->msg_control : \
          (struct cmsghdr *)NULL )
(Note: Most existing implementations do not test the value of
msg_controllen, and just return the value of msg_control. The value
of msg_controllen must be tested, because if the application asks
recvmsg() to return ancillary data, by setting msg_control to point
to the application's buffer and setting msg_controllen to the length
of this buffer, the kernel indicates that no ancillary data is
available by setting msg_controllen to 0 on return. It is also
easier to put this test into this macro, than making the application
perform the test.)
4.3.2. CMSG_NXTHDR
struct cmsghdr *CMSG_NXTHDR(const struct msghdr *mhdr,
const struct cmsghdr *cmsg);
CMSG_NXTHDR() returns a pointer to the cmsghdr structure describing
the next ancillary data object. mhdr is a pointer to a msghdr
structure and cmsg is a pointer to a cmsghdr structure. If there is
not another ancillary data object, the return value is NULL.
The following behavior of this function is new to this API: if the
value of the cmsg pointer is NULL, a pointer to the cmsghdr structure
describing the first ancillary data object is returned. That is,
CMSG_NXTHDR(mhdr, NULL) is equivalent to CMSG_FIRSTHDR(mhdr). If
there are no ancillary data objects, the return value is NULL. This
provides an alternative way of coding the processing loop shown
earlier:
    struct msghdr   msg;
    struct cmsghdr  *cmsgptr = NULL;
    /* fill in msg */
    /* call recvmsg() */
    while ((cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) != NULL) {
        if (cmsgptr->cmsg_level == ... && cmsgptr->cmsg_type == ... ) {
            u_char  *ptr;
            ptr = CMSG_DATA(cmsgptr);
            /* process data pointed to by ptr */
        }
    }
One possible implementation could be:
    #define CMSG_NXTHDR(mhdr, cmsg) \
        ( ((cmsg) == NULL) ? CMSG_FIRSTHDR(mhdr) : \
          (((u_char *)(cmsg) + ALIGN((cmsg)->cmsg_len) \
                             + ALIGN(sizeof(struct cmsghdr)) > \
            (u_char *)((mhdr)->msg_control) + (mhdr)->msg_controllen) ? \
           (struct cmsghdr *)NULL : \
           (struct cmsghdr *)((u_char *)(cmsg) + ALIGN((cmsg)->cmsg_len))) )
The macro ALIGN(), which is implementation dependent, rounds its
argument up to the next even multiple of whatever alignment is
required (probably a multiple of 4 or 8 bytes).
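As an illustration only, one way such a rounding macro might be written is shown below. The names SKETCH_ALIGNBYTES and SKETCH_ALIGN are ours, and the choice of sizeof(long) as the alignment unit is an assumption; a real implementation picks whatever its architecture requires.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the implementation-dependent ALIGN().
 * Assumes the alignment unit is sizeof(long), which must be a power
 * of two for the mask below to work.
 */
#define SKETCH_ALIGNBYTES  (sizeof(long) - 1)
#define SKETCH_ALIGN(n) \
    (((size_t)(n) + SKETCH_ALIGNBYTES) & ~SKETCH_ALIGNBYTES)
```

With an 8-byte unit, for example, arguments 1 through 8 all round up to 8, and 9 rounds up to 16.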
4.3.3. CMSG_DATA
unsigned char *CMSG_DATA(const struct cmsghdr *cmsg);
CMSG_DATA() returns a pointer to the data (what is called the
cmsg_data[] member, even though such a member is not defined in the
structure) following a cmsghdr structure.
One possible implementation could be:
#define CMSG_DATA(cmsg) ( (u_char *)(cmsg) + \
ALIGN(sizeof(struct cmsghdr)) )
4.3.4. CMSG_SPACE
unsigned int CMSG_SPACE(unsigned int length);
This function is new with this API. Given the length of an ancillary
data object, CMSG_SPACE() returns the space required by the object
and its cmsghdr structure, including any padding needed to satisfy
alignment requirements. This function can be used, for example, to
allocate space dynamically for the ancillary data. This function
should not be used to initialize the cmsg_len member of a cmsghdr
structure; instead use the CMSG_LEN() function.
One possible implementation could be:
#define CMSG_SPACE(length) ( ALIGN(sizeof(struct cmsghdr)) + \
ALIGN(length) )
4.3.5. CMSG_LEN
unsigned int CMSG_LEN(unsigned int length);
This function is new with this API. Given the length of an ancillary
data object, CMSG_LEN() returns the value to store in the cmsg_len
member of the cmsghdr structure, taking into account any padding
needed to satisfy alignment requirements.
One possible implementation could be:
#define CMSG_LEN(length) ( ALIGN(sizeof(struct cmsghdr)) + (length) )
Note the difference between CMSG_SPACE() and CMSG_LEN(), shown also
in the figure in Section 4.2: the former accounts for any required
padding at the end of the ancillary data object and the latter is the
actual length to store in the cmsg_len member of the ancillary data
object.
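To make the distinction concrete, the following sketch builds a control buffer holding two ancillary data objects: CMSG_SPACE() sizes the buffer and CMSG_LEN() fills in each cmsg_len. The function name and the pktinfo_sketch structure are ours; the structure is a local stand-in that mirrors the in6_pktinfo layout given in Section 5, used here so that the example does not depend on how a particular system exposes that structure.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical stand-in mirroring the in6_pktinfo layout of Section 5. */
struct pktinfo_sketch {
    struct in6_addr addr;     /* src/dst IPv6 address */
    int             ifindex;  /* send/recv interface index */
};

/*
 * Fill buf with a hop limit object followed by a packet information
 * object and attach it to msg.  buf must be suitably aligned for a
 * struct cmsghdr.  Returns 0 on success, -1 if buf is too small.
 */
int
build_hoplimit_and_pktinfo(struct msghdr *msg, void *buf, size_t buflen,
                           int hoplimit, const struct in6_addr *addr,
                           int ifindex)
{
    struct pktinfo_sketch pi;
    struct cmsghdr *cmsg;
    size_t need = CMSG_SPACE(sizeof(int)) + CMSG_SPACE(sizeof(pi));

    if (buflen < need)
        return -1;
    memset(buf, 0, need);           /* also clears any padding */
    msg->msg_control = buf;
    msg->msg_controllen = need;     /* sum of the CMSG_SPACE() values */

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = IPPROTO_IPV6;
    cmsg->cmsg_type = IPV6_HOPLIMIT;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));   /* no trailing padding */
    memcpy(CMSG_DATA(cmsg), &hoplimit, sizeof(int));

    cmsg = CMSG_NXTHDR(msg, cmsg);
    if (cmsg == NULL)
        return -1;
    memset(&pi, 0, sizeof(pi));
    pi.addr = *addr;
    pi.ifindex = ifindex;
    cmsg->cmsg_level = IPPROTO_IPV6;
    cmsg->cmsg_type = IPV6_PKTINFO;
    cmsg->cmsg_len = CMSG_LEN(sizeof(pi));
    memcpy(CMSG_DATA(cmsg), &pi, sizeof(pi));
    return 0;
}
```

Note that msg_controllen is the sum of the CMSG_SPACE() values, while each cmsg_len is the smaller CMSG_LEN() value.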
4.4. Summary of Options Described Using Ancillary Data
There are six types of optional information described in this
document that are passed between the application and the kernel using
ancillary data:
1. the send/receive interface and source/destination address,
2. the hop limit,
3. next hop address,
4. Hop-by-Hop options,
5. Destination options, and
6. Routing header.
First, to receive any of this optional information (other than the
next hop address, which can only be set), the application must call
setsockopt() to turn on the corresponding flag:
int on = 1;
setsockopt(fd, IPPROTO_IPV6, IPV6_PKTINFO, &on, sizeof(on));
setsockopt(fd, IPPROTO_IPV6, IPV6_HOPLIMIT, &on, sizeof(on));
setsockopt(fd, IPPROTO_IPV6, IPV6_HOPOPTS, &on, sizeof(on));
setsockopt(fd, IPPROTO_IPV6, IPV6_DSTOPTS, &on, sizeof(on));
setsockopt(fd, IPPROTO_IPV6, IPV6_SRCRT, &on, sizeof(on));
When any of these options are enabled, the corresponding data is
returned as control information by recvmsg(), as one or more
ancillary data objects.
Nothing special need be done to send any of this optional
information; the application just calls sendmsg() and specifies one
or more ancillary data objects as control information.
We also summarize the three cmsghdr fields that describe the
ancillary data objects:
cmsg_level    cmsg_type      cmsg_data[]               #times
------------  -------------  ------------------------  ------
IPPROTO_IPV6  IPV6_PKTINFO   in6_pktinfo structure     once
IPPROTO_IPV6  IPV6_HOPLIMIT  int                       once
IPPROTO_IPV6  IPV6_NEXTHOP   socket address structure  once
IPPROTO_IPV6  IPV6_HOPOPTS   implementation dependent  mult.
IPPROTO_IPV6  IPV6_DSTOPTS   implementation dependent  mult.
IPPROTO_IPV6  IPV6_SRCRT     implementation dependent  once
The final column indicates how many times an ancillary data object of
that type can appear as control information. The Hop-by-Hop and
Destination options can appear multiple times, while all the others
can appear only one time.
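The "once" rule in the final column can be checked before calling sendmsg(). The sketch below is one possible check; the function name is ours, and the Routing header type is left out only because the constant name used for it may differ between systems.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical check (not part of this API): return 0 if none of the
 * once-only IPv6 ancillary data types appears more than once in msg,
 * or -1 otherwise.  Hop-by-Hop and Destination options may repeat.
 */
int
check_once_only(struct msghdr *msg)
{
    int pktinfo = 0, hoplimit = 0, nexthop = 0;
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level != IPPROTO_IPV6)
            continue;
        switch (c->cmsg_type) {
        case IPV6_PKTINFO:
            pktinfo++;
            break;
        case IPV6_HOPLIMIT:
            hoplimit++;
            break;
        case IPV6_NEXTHOP:
            nexthop++;
            break;
        default:            /* types that may appear multiple times */
            break;
        }
    }
    return (pktinfo > 1 || hoplimit > 1 || nexthop > 1) ? -1 : 0;
}
```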
All these options are described in detail in following sections. All
the constants beginning with IPV6_ are defined as a result of
including the <netinet/in.h> header.
(Note: It is up to the implementation what it passes as ancillary
data for the Hop-by-Hop option, Destination option, and source route
option, since the API to these features is through a set of
inet6_option_XXX() and inet6_srcrt_XXX() functions that we define
later. These functions serve two purposes: to simplify the interface
to these features (instead of requiring the application to know the
intimate details of the extension header formats), and to hide the
actual implementation from the application. Nevertheless, we show
some examples of these features that store the actual extension
header as the ancillary data. Implementations need not use this
technique.)
4.5. TCP Access to Ancillary Data
The summary in the previous section assumes a UDP socket. Sending
and receiving ancillary data is easy with UDP: the application calls
sendmsg() and recvmsg() instead of sendto() and recvfrom().
But there might be cases where a TCP application wants to send or
receive this optional information. For example, a TCP client might
want to specify a source route and this needs to be done before
calling connect(). Similarly a TCP server might want to know the
received interface after accept() returns along with any Destination
options.
One new socket option is defined to allow TCP access to these
optional fields, although it is valid to use this with UDP or raw
sockets as well. Setting the socket option specifies any of the
optional output fields:
setsockopt(fd, IPPROTO_IPV6, IPV6_PKTOPTIONS, &buf, len);
The fourth argument points to a buffer containing one or more
ancillary data objects, and the fifth argument is the total length of
all these objects. The application fills in this buffer exactly as
if the buffer were being passed to sendmsg() as control information.
The corresponding receive option
getsockopt(fd, IPPROTO_IPV6, IPV6_PKTOPTIONS, &buf, &len);
returns a buffer with one or more ancillary data objects for all the
optional receive information that the application has previously
specified that it wants to receive. The fourth argument points to
the buffer that is filled in by the call. The fifth argument is a
pointer to a value-result integer: when the function is called the
integer specifies the size of the buffer pointed to by the fourth
argument, and on return this integer contains the actual number of
bytes that were returned. The application processes this buffer
exactly as if the buffer were returned by recvmsg() as control
information.
When using getsockopt() with the IPV6_PKTOPTIONS option, only the
options from the most recently received segment are retained and
returned to the caller. Also, none of the ancillary data that we
describe in this document is ever returned as control information by
recvmsg() on a TCP socket.
The options set by calling setsockopt() for IPV6_PKTOPTIONS are
called "sticky" options because once set they apply to all packets
sent on that socket. They may, however, be overridden with ancillary
data specified in a call to sendmsg().
But the following three options are considered a set: Hop-by-Hop,
Destination, and Routing header options. If any of these three
options are specified in a call to sendmsg(), then none of these
three from the socket's sticky options are sent for this packet. For
example, if the application calls setsockopt() for IPV6_PKTOPTIONS
and sets sticky values for the Hop-by-Hop and Destination options,
but then calls sendmsg() specifying just a Routing header as an
ancillary data object, then only the Routing header is sent with this
packet. The two sticky options, Hop-by-Hop and Destination, are not
sent for this packet.
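The override rule for this set of three options can be modeled in a few lines. This is only an illustration of the rule as stated above: the flag names and the function are ours, and a kernel would of course operate on the option contents, not on bit flags.

```c
/* Hypothetical flags, one per option in the set described above. */
enum {
    OPT_HOPOPTS = 1 << 0,   /* Hop-by-Hop options */
    OPT_DSTOPTS = 1 << 1,   /* Destination options */
    OPT_RTHDR   = 1 << 2,   /* Routing header */
    OPT_EXT_SET = OPT_HOPOPTS | OPT_DSTOPTS | OPT_RTHDR
};

/*
 * Given the sticky options set on the socket and the options supplied
 * as ancillary data in one sendmsg() call, return which of the three
 * are sent with that packet.  If the call supplies any member of the
 * set, the sticky members of the set are all ignored for that packet.
 */
unsigned
effective_ext_headers(unsigned sticky, unsigned per_call)
{
    if (per_call & OPT_EXT_SET)
        return per_call & OPT_EXT_SET;
    return sticky & OPT_EXT_SET;
}
```

For the example in the text, effective_ext_headers(OPT_HOPOPTS | OPT_DSTOPTS, OPT_RTHDR) yields OPT_RTHDR alone.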
5. Packet Information
There are four pieces of information that an application can specify
for an outgoing packet using ancillary data:
1. the source IPv6 address,
2. the outgoing interface index,
3. the outgoing hop limit, and
4. the next hop address.
Three similar pieces of information can be returned for a received
packet as ancillary data:
1. the destination IPv6 address,
2. the arriving interface index, and
3. the arriving hop limit.
The flow label can also be considered as packet information, but its
semantics differ from these three, so we describe it in Section 6.
The first two pieces of information are contained in an in6_pktinfo
structure that is sent as ancillary data with sendmsg() and received
as ancillary data with recvmsg(). This structure is defined as a
result of including the <netinet/in.h> header.
struct in6_pktinfo {
struct in6_addr ipi6_addr; /* src/dst IPv6 address */
int ipi6_ifindex; /* send/recv interface index */
};
In the cmsghdr structure containing this ancillary data, the
cmsg_level member will be IPPROTO_IPV6, the cmsg_type member will be
IPV6_PKTINFO, and the first byte of cmsg_data[] will be the first
byte of the in6_pktinfo structure.
This information is returned as ancillary data by recvmsg() only if
the application has enabled the IPV6_PKTINFO socket option:
int on = 1;
setsockopt(fd, IPPROTO_IPV6, IPV6_PKTINFO, &on, sizeof(on));
Nothing special need be done to send this information: just specify
the control information as ancillary data for sendmsg().
(Note: The hop limit is not contained in the in6_pktinfo structure
for the following reason. Some UDP servers want to respond to client
requests by sending their reply out the same interface on which the
request was received and with the source IPv6 address of the reply
equal to the destination IPv6 address of the request. To do this the
application can enable just the IPV6_PKTINFO socket option and then
use the received control information from recvmsg() as the outgoing
control information for sendmsg(). The application need not examine
or modify the in6_pktinfo structure at all. But if the hop limit
were contained in this structure, the application would have to parse
the received control information and change the hop limit member,
since the received hop limit is not the desired value for an outgoing
packet.)
5.1. Specifying/Receiving the Interface
Interfaces on an IPv6 node are identified by a small positive
integer, as described in Section 4 of [2]. That document also
describes a function to map an interface name to its interface index,
a function to map an interface index to its interface name, and a
function to return all the interface names and indexes. Notice from
this document that no interface is ever assigned an index of 0.
When specifying the outgoing interface, if the ipi6_ifindex value is
0, the kernel will choose the outgoing interface. If the application
specifies an outgoing interface for a multicast packet, the interface
specified by the ancillary data overrides any interface specified by
the IPV6_MULTICAST_IF socket option (described in [2]), for that call
to sendmsg() only.
When the IPV6_PKTINFO socket option is enabled, the received
interface index is always returned as the ipi6_ifindex member of the
in6_pktinfo structure.
5.2. Specifying/Receiving Source/Destination Address
The source IPv6 address can be specified by calling bind() before
each output operation, but supplying the source address together with
the data requires less overhead (i.e., fewer system calls) and
requires less state to be stored and protected in a multithreaded
application.
When specifying the source IPv6 address as ancillary data, if the
ipi6_addr member of the in6_pktinfo structure is the unspecified
address (IN6ADDR_ANY_INIT), then (a) if an address is currently bound
to the socket, it is used as the source address, or (b) if no address
is currently bound to the socket, the kernel will choose the source
address. If the ipi6_addr member is not the unspecified address, but
the socket has already bound a source address, then the ipi6_addr
value overrides the already-bound source address for this output
operation only.
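The selection rule in the preceding paragraph can be summarized in code. This is a model of the rule only, not kernel code; the enumeration and function names are ours, and a NULL bound pointer is our convention for "no address bound to the socket".

```c
#include <stddef.h>
#include <netinet/in.h>

/* Hypothetical result codes for the source address decision. */
enum src_choice {
    SRC_KERNEL_CHOOSES,   /* nothing specified and nothing bound */
    SRC_BOUND_ADDR,       /* address bound to the socket is used */
    SRC_ANCILLARY_ADDR    /* ipi6_addr is used for this operation */
};

/*
 * ipi6_addr is the address supplied as ancillary data; bound is the
 * address bound to the socket, or NULL if none is bound.
 */
enum src_choice
choose_source(const struct in6_addr *ipi6_addr,
              const struct in6_addr *bound)
{
    if (!IN6_IS_ADDR_UNSPECIFIED(ipi6_addr))
        return SRC_ANCILLARY_ADDR;
    if (bound != NULL && !IN6_IS_ADDR_UNSPECIFIED(bound))
        return SRC_BOUND_ADDR;
    return SRC_KERNEL_CHOOSES;
}
```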
When the in6_pktinfo structure is returned as ancillary data by
recvmsg(), the ipi6_addr member contains the destination IPv6 address
from the received packet.
5.3. Specifying/Receiving the Hop Limit
The outgoing hop limit is normally specified with either the
IPV6_UNICAST_HOPS socket option or the IPV6_MULTICAST_HOPS socket
option, both of which are described in [2]. Specifying the hop limit
as ancillary data lets the application override either the kernel's
default or a previously specified value, for either a unicast
destination or a multicast destination, for a single output
operation. Returning the received hop limit is useful for programs
such as Traceroute and for IPv6 applications that need to verify that
the received hop limit is 255 (i.e., that the packet has not been
forwarded).
The received hop limit is returned as ancillary data by recvmsg()
only if the application has enabled the IPV6_HOPLIMIT socket option:
int on = 1;
setsockopt(fd, IPPROTO_IPV6, IPV6_HOPLIMIT, &on, sizeof(on));
In the cmsghdr structure containing this ancillary data, the
cmsg_level member will be IPPROTO_IPV6, the cmsg_type member will be
IPV6_HOPLIMIT, and the first byte of cmsg_data[] will be the first
byte of the integer hop limit.
Nothing special need be done to specify the outgoing hop limit: just
specify the control information as ancillary data for sendmsg(). As
specified in [2], the interpretation of the integer hop limit value
is
x < -1:         return an error of EINVAL
x == -1:        use kernel default
0 <= x <= 255:  use x
x >= 256:       return an error of EINVAL
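These four cases can be written as a small validation routine. The function name and the kernel_default parameter are ours; the sketch simply restates the rules above.

```c
#include <errno.h>

/*
 * Hypothetical helper: interpret the integer hop limit x supplied by
 * the application.  On success, store the hop limit to use in *out
 * and return 0; otherwise return EINVAL and leave *out untouched.
 */
int
resolve_hop_limit(int x, int kernel_default, int *out)
{
    if (x < -1 || x >= 256)
        return EINVAL;
    *out = (x == -1) ? kernel_default : x;
    return 0;
}
```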
5.4. Specifying the Next Hop Address
The IPV6_NEXTHOP ancillary data object specifies the next hop for the
datagram as a socket address structure. In the cmsghdr structure
containing this ancillary data, the cmsg_level member will be
IPPROTO_IPV6, the cmsg_type member will be IPV6_NEXTHOP, and the
first byte of cmsg_data[] will be the first byte of the socket
address structure.
This is a privileged option.
If the socket address structure contains an IPv6 address (i.e., the
sin6_family member is AF_INET6), then the node identified by that
address must be a neighbor of the sending host. If that address
equals the destination IPv6 address of the datagram, then this is
equivalent to the existing SO_DONTROUTE socket option.
5.5. Additional Errors with sendmsg()
With the IPV6_PKTINFO socket option there are no additional errors
possible with the call to recvmsg(). But when specifying the
outgoing interface or the source address, additional errors are
possible from sendmsg():
ENXIO          The interface specified by ipi6_ifindex does not exist.
ENETDOWN       The interface specified by ipi6_ifindex is not enabled
               for IPv6 use.
EADDRNOTAVAIL  ipi6_ifindex specifies an interface but the address
               ipi6_addr is not available for use on that interface.
EHOSTUNREACH   No route to the destination exists over the interface
               specified by ipi6_ifindex.
6. Flow Labels
IPv6 allows packets to be explicitly labeled as belonging to a flow
of related packets (Section 6 of [1]). All packets with a given IPv6
source address that share the same flow label must have the following
fields in common as well: destination address (unicast or multicast),
priority, Hop-by-Hop options header, and if a Routing header is
present, all extension headers up to and including the Routing
header. Flow label values must be uniformly distributed in the range
[1, 2^24-1] so that routers may use any portion of the flow label as
a hash key to access stored state for the flow.
The following points must be considered in designing an API to
specify flow labels.
- Space is already allocated in the sockaddr_in6 structure for the
flow label. This implies that the process specifies the value
(setting it to 0 to indicate no flow), in a call to connect() for
a connected socket, or in a call to sendto() or sendmsg() for an
unconnected socket. (Note: The sin6_flowinfo field performs
double duty, carrying both the outgoing flow and the incoming
flow. UDP applications that read requests using recvfrom() and
then send a reply using sendto() must not use the incoming flow
label for the outgoing reply.)
- Generation of flow labels should be in the kernel, since they must
be unique for a given source address, destination address and
priority. The kernel also must keep track of the assigned flow
labels to prevent them from being reused by a new flow within the
flow-state lifetime (6 seconds default).
- These first two points imply that the kernel assigns the flow
label, but the process needs a way to obtain its value from the
kernel.
- To assign a flow label the process must specify the destination
address and priority. (Note: The use of the priority field in the
IPv6 header is still subject to change. The basic API spec [2]
removed all references to this field for this reason. Therefore
it is unspecified how a process specifies a nonzero priority
field.)
- All packets belonging to the same flow must also have the same
Hop-by-Hop header and, if a Routing header is present, all
extension headers up to and including the Routing header.
Therefore, when a process asks to have a flow label assigned, it
should also specify these extension headers that must remain
constant for the flow.
- For a connected socket (TCP or UDP) the process must be able to
specify a flow label either when the connection is established (as
part of the sockaddr_in6 structure that is passed to connect()),
or after the connection is established (the kernel should notice
that the socket is already connected when it is asked to assign
the flow label, and then start using it for that socket). On
these connected sockets the process calls write() or send(), and
does not specify a sockaddr_in6 structure with the flow
label--hence the requirement that the kernel store the value and
automatically use it.
- For an unconnected UDP socket the process must ask the kernel to
assign the flow label, obtain the value, and then use that value
in subsequent calls to sendto() or sendmsg().
- It should be possible for a UDP application that will communicate
with N peer processes to assign up to N different flow labels to a
given socket. The process obtains the N values from the kernel
and then uses the correct one for each of the N peers.
- getpeername() can return the assigned flow label for a connected
socket, but this function cannot be used to return the flow label
for an unconnected socket.
- Flows are defined between a source and destination. It should be
possible for multiple sockets between a given source and
destination to share the same flow label. This implies that it
must be possible for a flow label assigned to one socket to be
"reused" on another socket.
One way a TCP client could do this, for example, is to obtain a
flow to a given destination and then simply use that flow label in
the socket address structures for multiple connect()s to the same
server (e.g., Web clients). But it should also be possible to use
some already assigned flow on an already connected socket,
implying some way to tell the kernel to use an already assigned
flow on a given socket.
- There is some error checking that the kernel could perform with
regard to flow labels, and the API should not address these, but
leave them up to the implementation. For example, what if the
process asks the kernel to allocate a flow label to DST1 for
SOCKFD1 but then calls connect(SOCKFD1) connecting to DST2 using
the flow label that was assigned to DST1? Or when a UDP
application allocates multiple flow labels, but uses them
incorrectly? Or when a UDP application allocates a flow to a
destination, but then sends datagrams with the flow label set to
0?
- Flow labels are often mentioned along with RSVP, but the
interaction between RSVP reservations and IPv6 flow labels is
unclear (Section 1.2 of [5]). We note that RSVP is receiver-driven,
while IPv6 flow labels must be chosen by the sender.
- Lastly, the use of flow labels is still experimental. All this
API can provide is some way to allocate flow labels within the
rules provided in [1], allowing the kernel to enforce the
requirements on common packet fields and freeing the application
from the burden of selecting unique pseudo-random flow labels.
The interface to the flow label feature is through three
inet6_flow_XXX() functions. The function prototypes for these
functions are all in the <netinet/in.h> header.
6.1. inet6_flow_assign
int inet6_flow_assign(int fd, struct sockaddr_in6 *sin6,
const void *buf, size_t len);
To cause a flow label to be assigned the application must specify the
socket, destination address, priority, and the optional headers that
are not allowed to change for the flow.
The socket address structure pointed to by sin6 specifies the
destination address and priority. The flow label and port number
fields are ignored.
The buffer specified by the buf and len arguments contains the
Hop-by-Hop options, the Destination options that precede the optional
Routing header, and the optional Routing header. The format of the
buffer is a sequence of ancillary data objects, as described with the
IPV6_PKTOPTIONS socket option.
The flow label is assigned and returned in the sin6_flowinfo member
of the socket address structure.
This function returns 0 on success, -1 on error.
If an earlier connect() or accept() has already connected the socket
to the destination address supplied in this call, then subsequent
output operations will have the assigned flow label in the IPv6
header.
If the socket is not connected then the application must use the
returned flow label in a subsequent call to connect(), sendto(), or
sendmsg().
(Note: It makes no sense to assign a flow to a listening TCP socket,
since a destination address is required to assign the flow.) (Note:
Since the socket address structure pointed to by the second argument
is both a value and a result, implementations might consider using
ioctl() for flow label access. Note that if this function were
implemented using setsockopt() followed by getsockopt(), it would not
be thread safe.)
6.2. inet6_flow_free
int inet6_flow_free(int fd, const struct sockaddr_in6 *sin6);
A previously assigned flow label can be explicitly freed. If this
function is not called, the flow label is automatically freed on the
last close of the socket.
The flow label field in the socket address structure specifies the
flow label that is being freed.
This function returns 0 on success, -1 on error.
6.3. inet6_flow_reuse
int inet6_flow_reuse(int currfd, int newfd,
const struct sockaddr_in6 *sin6);
A flow label assigned to one socket can be used on another socket
(subject to the basic limitations of flow labels, of course, such as
packets belonging to the flow from both sockets having the same
destination address, etc.). This function needs to be called only if
the new socket is already connected. If the new socket is not
already connected, the application can just specify the known flow
label in a call to connect(), sendto(), or sendmsg().
This function specifies that the flow label previously assigned to
the socket currfd is also to be used on the socket newfd.
The caller must fill in the destination address, priority, and flow
label fields of the socket address structure.
If the socket newfd is already connected to the destination address,
subsequent output operations will have the assigned flow label in the
IPv6 header.
This function returns 0 on success, -1 on error.
7. Hop-By-Hop Options
A variable number of Hop-by-Hop options can appear in a single Hop-
by-Hop options header. Each option in the header is TLV-encoded with
a type, length, and value.
Today only three Hop-by-Hop options are defined for IPv6 [1]: Jumbo
Payload, Pad1, and PadN, although a proposal exists for a router-
alert Hop-by-Hop option. The Jumbo Payload option should not be
passed back to an application and an application should receive an
error if it attempts to set it. This option is processed entirely by
the kernel. It is indirectly specified by datagram-based
applications as the size of the datagram to send and indirectly
passed back to these applications as the length of the received
datagram. The two pad options are for alignment purposes and are
automatically inserted by a sending kernel when needed and ignored by
the receiving kernel. This section of the API is therefore defined
for future Hop-by-Hop options that an application may need to specify
and receive.
Individual Hop-by-Hop options (and Destination options, which are
described shortly, and which are similar to the Hop-by-Hop options)
may have specific alignment requirements. For example, the 4-byte
Jumbo Payload length should appear on a 4-byte boundary, and IPv6
addresses are normally aligned on an 8-byte boundary. These
requirements and the terminology used with these options are
discussed in Section 4.2 and Appendix A of [1]. The alignment of
each option is specified by two values, called x and y, written as
"xn + y". This states that the option must appear at an integer
multiple of x bytes from the beginning of the options header (x can
have the values 1, 2, 4, or 8), plus y bytes (y can have a value
between 0 and 7, inclusive). The Pad1 and PadN options are inserted
as needed to maintain the required alignment. Whatever code builds
either a Hop-by-Hop options header or a Destination options header
must know the values of x and y for each option.
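As a sketch of the arithmetic involved, the number of pad bytes needed before an option follows directly from x, y, and the current offset into the options header. The function name is ours. A result of 1 would be satisfied with a Pad1 option and a result of 2 or more with a PadN option.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the number of pad bytes to insert so
 * that an option with alignment requirement "xn + y" starts at a
 * valid position.  offset is the position of the next free byte,
 * counted from the beginning of the options header; x must be
 * 1, 2, 4, or 8.
 */
size_t
option_pad_bytes(size_t offset, size_t x, size_t y)
{
    return (y % x + x - offset % x) % x;
}
```

For example, the first option follows the two fixed bytes of the options header, so offset is 2: an option requiring 4n + 2 needs no padding there, while one requiring 8n + 0 needs 6 bytes.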
Multiple Hop-by-Hop options can be specified by the application.
Normally one ancillary data object describes all the Hop-by-Hop
options (since each option is itself TLV-encoded) but the application
can specify multiple ancillary data objects for the Hop-by-Hop
options, each object specifying one or more options. Care must be
taken designing the API for these options since
1. it may be possible for some future Hop-by-Hop options to be
generated by the application and processed entirely by the
application (e.g., the kernel may not know the alignment
restrictions for the option),
2. it must be possible for the kernel to insert its own Hop-by-Hop
options in an outgoing packet (e.g., the Jumbo Payload option),
3. the application can place one or more Hop-by-Hop options into a
single ancillary data object,
4. if the application specifies multiple ancillary data objects,
each containing one or more Hop-by-Hop options, the kernel must
combine these into a single Hop-by-Hop options header, and
5. it must be possible for the kernel to remove some Hop-by-Hop
options from a received packet before returning the remaining
Hop-by-Hop options to the application. (This removal might
consist of the kernel converting the option into a pad option of
the same length.)
Finally, we note that access to some Hop-by-Hop options or to some
Destination options might require special privilege. That is,
normal applications (without special privilege) might be forbidden
from setting certain options in outgoing packets, and might never see
certain options in received packets.
7.1. Receiving Hop-by-Hop Options
To receive Hop-by-Hop options the application must enable the
IPV6_HOPOPTS socket option:
int on = 1;
setsockopt(fd, IPPROTO_IPV6, IPV6_HOPOPTS, &on, sizeof(on));
All the Hop-by-Hop options are returned as one ancillary data object
described by a cmsghdr structure. The cmsg_level member will be
IPPROTO_IPV6 and the cmsg_type member will be IPV6_HOPOPTS. These
options are then processed by calling the inet6_option_next() and
inet6_option_find() functions, described shortly.
7.2. Sending Hop-by-Hop Options
To send one or more Hop-by-Hop options, the application just
specifies them as ancillary data in a call to sendmsg(). No socket
option need be set.
Normally all the Hop-by-Hop options are specified by a single
ancillary data object. Multiple ancillary data objects, each
containing one or more Hop-by-Hop options, can also be specified, in
which case the kernel will combine all the Hop-by-Hop options into a
single Hop-by-Hop extension header. But it should be more efficient
to use a single ancillary data object to describe all the Hop-by-Hop
options. The cmsg_level member is set to IPPROTO_IPV6 and the
cmsg_type member is set to IPV6_HOPOPTS. The option is normally
constructed using the inet6_option_init(), inet6_option_append(), and
inet6_option_alloc() functions, described shortly.
sendmsg() may return additional errors if a specified option is
invalid.
7.3. Hop-by-Hop and Destination Options Processing
Building and parsing the Hop-by-Hop and Destination options is
complicated for the reasons given earlier. We therefore define a set
of functions to help the application. The function prototypes for
these functions are all in the <netinet/in.h> header.
7.3.1. inet6_option_space
int inet6_option_space(int nbytes);
This function returns the number of bytes required to hold an option
when it is stored as ancillary data, including the cmsghdr structure
at the beginning, and any padding at the end (to make its size a
multiple of 8 bytes). The argument is the size of the structure
defining the option, which must include any pad bytes at the
beginning (the value y in the alignment term "xn + y"), the type
byte, the length byte, and the option data.
(Note: If multiple options are stored in a single ancillary data
object, which is the recommended technique, this function
overestimates the amount of space required by the size of N-1 cmsghdr
structures, where N is the number of options to be stored in the
object. This is of little consequence, since it is assumed that most
Hop-by-Hop option headers and Destination option headers carry only
one option (p. 33 of [1]).)
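The size calculation described above can be sketched as follows. This is only a model under stated assumptions: the name my_option_space() is hypothetical, and the rounding rule (pad the option bytes to a multiple of 8, then let CMSG_SPACE() add the cmsghdr structure) is our reading of the description, not a required implementation. The upper bound of 2048 bytes reflects the largest options header that an 8-bit Hdr Ext Len field can describe.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical stand-in for inet6_option_space(): bytes of ancillary
 * data needed to carry `nbytes` of option data (pad, type, length and
 * value bytes) in one ancillary data object.
 */
static size_t
my_option_space(size_t nbytes)
{
    assert(nbytes > 0 && nbytes <= 2048);

    size_t padded = (nbytes + 7) & ~(size_t)7;  /* round up to 8 bytes */

    return CMSG_SPACE(padded);                  /* add the cmsghdr */
}
```

Calling such a function once with the combined size of all options avoids the per-option cmsghdr overestimate mentioned in the note above.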
7.3.2. inet6_option_init
int inet6_option_init(void *bp, struct cmsghdr **cmsgp, int type);
This function is called once per ancillary data object that will
contain either Hop-by-Hop or Destination options. It returns 0 on
success or -1 on an error.
bp is a pointer to previously allocated space that will contain the
ancillary data object. It must be large enough to contain all the
individual options to be added by later calls to
inet6_option_append() and inet6_option_alloc().
cmsgp is a pointer to a pointer to a cmsghdr structure. *cmsgp is
initialized by this function to point to the cmsghdr structure
constructed by this function in the buffer pointed to by bp.
type is either IPV6_HOPOPTS or IPV6_DSTOPTS. This type is stored in
the cmsg_type member of the cmsghdr structure pointed to by *cmsgp.
7.3.3. inet6_option_append
int inet6_option_append(struct cmsghdr *cmsg, const u_int8_t *typep,
int multx, int plusy);
This function appends a Hop-by-Hop option or a Destination option
into an ancillary data object that has been initialized by
inet6_option_init(). This function returns 0 if it succeeds or -1 on
an error.
cmsg is a pointer to the cmsghdr structure that must have been
initialized by inet6_option_init().
typep is a pointer to the 8-bit option type. It is assumed that this
field is immediately followed by the 8-bit option data length field,
which is then followed immediately by the option data. The caller
initializes these three fields (the type-length-value, or TLV) before
calling this function.
The option type must have a value from 2 to 255, inclusive. (0 and 1
are reserved for the Pad1 and PadN options, respectively.)
The option data length must have a value between 0 and 255,
inclusive, and is the length of the option data that follows.
multx is the value x in the alignment term "xn + y" described
earlier. It must have a value of 1, 2, 4, or 8.
plusy is the value y in the alignment term "xn + y" described
earlier. It must have a value between 0 and 7, inclusive.
7.3.4. inet6_option_alloc
u_int8_t *inet6_option_alloc(struct cmsghdr *cmsg, int datalen,
int multx, int plusy);
This function appends a Hop-by-Hop option or a Destination option
into an ancillary data object that has been initialized by
inet6_option_init(). This function returns a pointer to the 8-bit
option type field that starts the option on success, or NULL on an
error.
The difference between this function and inet6_option_append() is
that the latter copies the contents of a previously built option into
the ancillary data object while the current function returns a
pointer to the space in the data object where the option's TLV must
then be built by the caller.
cmsg is a pointer to the cmsghdr structure that must have been
initialized by inet6_option_init().
datalen is the value of the option data length byte for this option.
This value is required as an argument to allow the function to
determine if padding must be appended at the end of the option. (The
inet6_option_append() function does not need a data length argument
since the option data length must already be stored by the caller.)
multx is the value x in the alignment term "xn + y" described
earlier. It must have a value of 1, 2, 4, or 8.
plusy is the value y in the alignment term "xn + y" described
earlier. It must have a value between 0 and 7, inclusive.
7.3.5. inet6_option_next
int inet6_option_next(const struct cmsghdr *cmsg, u_int8_t **tptrp);
This function processes the next Hop-by-Hop option or Destination
option in an ancillary data object. If another option remains to be
processed, the return value of the function is 0 and *tptrp points to
the 8-bit option type field (which is followed by the 8-bit option
data length, followed by the option data). If no more options remain
to be processed, the return value is -1 and *tptrp is NULL. If an
error occurs, the return value is -1 and *tptrp is not NULL.
cmsg is a pointer to a cmsghdr structure whose cmsg_level equals
IPPROTO_IPV6 and whose cmsg_type equals either IPV6_HOPOPTS or
IPV6_DSTOPTS.
tptrp is a pointer to a pointer to an 8-bit byte and *tptrp is used
by the function to remember its place in the ancillary data object
each time the function is called. The first time this function is
called for a given ancillary data object, *tptrp must be set to NULL.
Each time this function returns success, *tptrp points to the 8-bit
option type field for the next option to be processed.
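The iteration that inet6_option_next() performs can be modeled on a plain byte array. The sketch below is not the function defined above: opt_next() is a hypothetical name, it takes the option bytes directly (starting just past the Next Header and Hdr Ext Len bytes) instead of a cmsghdr structure, and it reports errors through a separate flag. It shows the one point that matters to the caller, namely that Pad1 and PadN options are skipped and only real options are returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_PAD1 0   /* Pad1: a single zero byte, no length field */
#define OPT_PADN 1   /* PadN: type, length, then `length` data bytes */

/*
 * Hypothetical model of option iteration.  `opts` holds `len` option
 * bytes and *posp is the caller's cursor (0 on the first call).
 * Returns a pointer to the type byte of the next non-pad option, or
 * NULL when no options remain; *errp is set to 1 on a truncated TLV.
 */
static const uint8_t *
opt_next(const uint8_t *opts, size_t len, size_t *posp, int *errp)
{
    assert(opts != NULL && posp != NULL && errp != NULL);

    size_t pos = *posp;

    *errp = 0;
    while (pos < len) {
        if (opts[pos] == OPT_PAD1) {         /* one-byte pad */
            pos++;
            continue;
        }
        if (pos + 2 > len || pos + 2 + opts[pos + 1] > len) {
            *errp = 1;                       /* option runs off the end */
            return NULL;
        }
        size_t optlen = 2 + (size_t)opts[pos + 1];
        if (opts[pos] == OPT_PADN) {         /* multi-byte pad */
            pos += optlen;
            continue;
        }
        *posp = pos + optlen;                /* resume after this option */
        return &opts[pos];
    }
    *posp = pos;
    return NULL;
}
```

A caller sets the cursor to 0, then calls opt_next() repeatedly until it returns NULL, checking the error flag afterwards.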
7.3.6. inet6_option_find
int inet6_option_find(const struct cmsghdr *cmsg, u_int8_t **tptrp,
int type);
This function is similar to the previously described
inet6_option_next() function, except this function lets the caller
specify the option type to be searched for, instead of always
returning the next option in the ancillary data object.
cmsg is a pointer to a cmsghdr structure whose cmsg_level equals
IPPROTO_IPV6 and whose cmsg_type equals either IPV6_HOPOPTS or
IPV6_DSTOPTS.
tptrp is a pointer to a pointer to an 8-bit byte and *tptrp is used
by the function to remember its place in the ancillary data object
each time the function is called. The first time this function is
called for a given ancillary data object, *tptrp must be set to NULL.
This function starts searching for an option of the specified type
beginning after the value of *tptrp. If an option of the specified
type is located, this function returns 0 and *tptrp points to the
8-bit option type field for the option of the specified type. If an
option of the specified type is not located, the return value is -1
and *tptrp is NULL. If an error occurs, the return value is -1 and
*tptrp is not NULL.
7.3.7. Options Examples
We now provide an example that builds two Hop-by-Hop options. First
we define two options, called X and Y, taken from the example in
Appendix A of [1]. We assume that all options will have structure
definitions similar to what is shown below.
/* option X and option Y are defined in [1], pp. 33-34 */

#define IPV6_OPT_X_TYPE     X   /* replace X with assigned value */
#define IPV6_OPT_X_LEN      12
#define IPV6_OPT_X_MULTX    8   /* 8n + 2 alignment */
#define IPV6_OPT_X_OFFSETY  2

struct ipv6_opt_X {
    u_int8_t   opt_X_pad[IPV6_OPT_X_OFFSETY];
    u_int8_t   opt_X_type;
    u_int8_t   opt_X_len;
    u_int32_t  opt_X_val1;
    u_int64_t  opt_X_val2;
};

#define IPV6_OPT_Y_TYPE     Y   /* replace Y with assigned value */
#define IPV6_OPT_Y_LEN      7
#define IPV6_OPT_Y_MULTX    4   /* 4n + 3 alignment */
#define IPV6_OPT_Y_OFFSETY  3

struct ipv6_opt_Y {
    u_int8_t   opt_Y_pad[IPV6_OPT_Y_OFFSETY];
    u_int8_t   opt_Y_type;
    u_int8_t   opt_Y_len;
    u_int8_t   opt_Y_val1;
    u_int16_t  opt_Y_val2;
    u_int32_t  opt_Y_val3;
};
We now show the code fragment to build one ancillary data object
containing both options.
struct msghdr  msg;
struct cmsghdr  *cmsgptr;
struct ipv6_opt_X  optX;
struct ipv6_opt_Y  optY;

/* room for the cmsghdr structure plus both options and padding */
msg.msg_control = malloc(inet6_option_space(sizeof(optX) +
                                            sizeof(optY)));

inet6_option_init(msg.msg_control, &cmsgptr, IPV6_HOPOPTS);

optX.opt_X_type = IPV6_OPT_X_TYPE;
optX.opt_X_len = IPV6_OPT_X_LEN;
optX.opt_X_val1 = <32-bit value>;
optX.opt_X_val2 = <64-bit value>;
inet6_option_append(cmsgptr, &optX.opt_X_type,
                    IPV6_OPT_X_MULTX, IPV6_OPT_X_OFFSETY);

optY.opt_Y_type = IPV6_OPT_Y_TYPE;
optY.opt_Y_len = IPV6_OPT_Y_LEN;
optY.opt_Y_val1 = <8-bit value>;
optY.opt_Y_val2 = <16-bit value>;
optY.opt_Y_val3 = <32-bit value>;
inet6_option_append(cmsgptr, &optY.opt_Y_type,
                    IPV6_OPT_Y_MULTX, IPV6_OPT_Y_OFFSETY);

/* cmsg_len already includes the cmsghdr structure */
msg.msg_controllen = cmsgptr->cmsg_len;
The call to inet6_option_init() builds the cmsghdr structure in the
control buffer.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  cmsg_len = CMSG_LEN(0) = 12                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_level = IPPROTO_IPV6                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_type = IPV6_HOPOPTS                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Here we assume a 32-bit architecture where sizeof(struct cmsghdr)
equals 12, with a desired alignment of 4-byte boundaries (that is,
the ALIGN() macro shown in the sample implementations of the
CMSG_xxx() functions rounds up to a multiple of 4).
The first call to inet6_option_append() appends the X option. Since
this is the first option in the ancillary data object, 2 bytes are
allocated for the Next Header byte and for the Hdr Ext Len byte. The
former will be set by the kernel, depending on the type of header
that follows this header, and the latter byte is set to 1. These 2
bytes form the 2 bytes of padding (IPV6_OPT_X_OFFSETY) required at
the beginning of this option.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         cmsg_len = 28                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_level = IPPROTO_IPV6                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_type = IPV6_HOPOPTS                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Next Header  | Hdr Ext Len=1 | Option Type=X |Opt Data Len=12|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         4-octet field                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                         8-octet field                         +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The cmsg_len member of the cmsghdr structure is incremented by 16,
the size of the option.
The next call to inet6_option_append() appends the Y option to the
ancillary data object.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         cmsg_len = 44                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_level = IPPROTO_IPV6                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_type = IPV6_HOPOPTS                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Next Header  | Hdr Ext Len=3 | Option Type=X |Opt Data Len=12|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         4-octet field                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                         8-octet field                         +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| PadN Option=1 |Opt Data Len=1 |       0       | Option Type=Y |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Opt Data Len=7 | 1-octet field |         2-octet field         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         4-octet field                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| PadN Option=1 |Opt Data Len=2 |       0       |       0       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
16 bytes are appended by this function, so cmsg_len becomes 44: 3
bytes of PadN to place option Y on a 4n + 3 boundary, the 9 bytes of
option Y itself, and 4 bytes of trailing padding. The
inet6_option_append() function notices that the option data would
otherwise end 4 bytes short of a multiple of 8, and appends this
trailing PadN option before returning. The Hdr Ext Len byte is
incremented by 2 to become
3.
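The byte layout produced by the two append calls can be reproduced with a small user-space model. The sketch below is not an implementation of inet6_option_append(): it works on a bare options header with no cmsghdr structure, and the names opt_hdr, opt_hdr_init(), and opt_hdr_append() are hypothetical. It assumes, as in the walk-through, that leading padding is chosen from the "xn + y" term and that trailing padding brings the header to a multiple of 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct opt_hdr {
    uint8_t buf[2048];  /* Next Header, Hdr Ext Len, then the options */
    size_t  end;        /* offset just past the last real option */
    size_t  total;      /* header length including trailing padding */
};

/* Write n bytes of padding: Pad1 when n == 1, PadN otherwise. */
static void
put_pad(uint8_t *p, size_t n)
{
    if (n == 0)
        return;
    if (n == 1) {
        p[0] = 0;                    /* Pad1 */
        return;
    }
    p[0] = 1;                        /* PadN */
    p[1] = (uint8_t)(n - 2);         /* length of the pad data */
    memset(p + 2, 0, n - 2);
}

static void
opt_hdr_init(struct opt_hdr *h)
{
    memset(h, 0, sizeof(*h));
    h->end = 2;                      /* Next Header and Hdr Ext Len */
}

/* Append one TLV (type, length, data) aligned at multx*n + plusy. */
static int
opt_hdr_append(struct opt_hdr *h, const uint8_t *tlv, int multx, int plusy)
{
    assert(multx == 1 || multx == 2 || multx == 4 || multx == 8);
    assert(plusy >= 0 && plusy <= 7);

    size_t x = (size_t)multx;
    size_t pad = ((size_t)plusy % x + x - h->end % x) % x;
    size_t optlen = 2 + (size_t)tlv[1];
    size_t end = h->end + pad + optlen;
    size_t total = (end + 7) & ~(size_t)7;

    if (total > sizeof(h->buf))
        return -1;
    put_pad(h->buf + h->end, pad);            /* leading alignment pad */
    memcpy(h->buf + h->end + pad, tlv, optlen);
    put_pad(h->buf + end, total - end);       /* trailing pad to 8n */
    h->end = end;
    h->total = total;
    h->buf[1] = (uint8_t)(total / 8 - 1);     /* Hdr Ext Len */
    return 0;
}
```

Appending a 14-byte option X with 8n + 2 alignment and then a 9-byte option Y with 4n + 3 alignment yields the 32-byte header shown in the figure, with Hdr Ext Len equal to 3. Each later append overwrites the trailing padding left by the previous one.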
Alternatively, the application could build two ancillary data objects,
one per option, although this will probably be less efficient than
combining the two options into a single ancillary data object (as
just shown). The kernel must combine these into a single Hop-by-Hop
extension header in the final IPv6 packet.
struct msghdr  msg;
struct cmsghdr  *cmsgptr;
struct ipv6_opt_X  optX;
struct ipv6_opt_Y  optY;

/* room for two ancillary data objects, one per option */
msg.msg_control = malloc(inet6_option_space(sizeof(optX)) +
                         inet6_option_space(sizeof(optY)));

inet6_option_init(msg.msg_control, &cmsgptr, IPV6_HOPOPTS);

optX.opt_X_type = IPV6_OPT_X_TYPE;
optX.opt_X_len = IPV6_OPT_X_LEN;
optX.opt_X_val1 = <32-bit value>;
optX.opt_X_val2 = <64-bit value>;
inet6_option_append(cmsgptr, &optX.opt_X_type,
                    IPV6_OPT_X_MULTX, IPV6_OPT_X_OFFSETY);

/* offset of the second object: header plus 16 bytes of option X */
msg.msg_controllen = CMSG_SPACE(sizeof(optX));

inet6_option_init((u_char *)msg.msg_control + msg.msg_controllen,
                  &cmsgptr, IPV6_HOPOPTS);

optY.opt_Y_type = IPV6_OPT_Y_TYPE;
optY.opt_Y_len = IPV6_OPT_Y_LEN;
optY.opt_Y_val1 = <8-bit value>;
optY.opt_Y_val2 = <16-bit value>;
optY.opt_Y_val3 = <32-bit value>;
inet6_option_append(cmsgptr, &optY.opt_Y_type,
                    IPV6_OPT_Y_MULTX, IPV6_OPT_Y_OFFSETY);

/* cmsg_len already includes the cmsghdr structure */
msg.msg_controllen += cmsgptr->cmsg_len;
Each call to inet6_option_init() builds a new cmsghdr structure, and
the final result looks like the following:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         cmsg_len = 28                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_level = IPPROTO_IPV6                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_type = IPV6_HOPOPTS                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Next Header  | Hdr Ext Len=1 | Option Type=X |Opt Data Len=12|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         4-octet field                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                         8-octet field                         +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         cmsg_len = 28                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_level = IPPROTO_IPV6                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   cmsg_type = IPV6_HOPOPTS                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Next Header  | Hdr Ext Len=1 | Pad1 Option=0 | Option Type=Y |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Opt Data Len=7 | 1-octet field |         2-octet field         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         4-octet field                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| PadN Option=1 |Opt Data Len=2 |       0       |       0       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
When the kernel combines these two options into a single Hop-by-Hop
extension header, the first 3 bytes of the second ancillary data
object (the Next Header byte, the Hdr Ext Len byte, and the Pad1
option) will be combined into a PadN option occupying 3 bytes.
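One way a kernel might perform this step is sketched below. The function name fold_leading_bytes() is hypothetical, and the approach (overwrite the two fixed bytes, together with any pad options that directly follow them, with a single PadN) is only one possible strategy, not a requirement of this document. Because every ancillary data object holds a multiple of 8 option bytes, replacing the two fixed bytes with padding of the same size keeps each following option on its original "xn + y" boundary once the bytes are concatenated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch.  `hdr` holds the `len` option bytes of a second
 * ancillary data object, beginning with its Next Header and Hdr Ext Len
 * bytes.  Those two bytes and any Pad1/PadN options immediately after
 * them are rewritten as one PadN.  Returns the size of that PadN.
 */
static size_t
fold_leading_bytes(uint8_t *hdr, size_t len)
{
    assert(hdr != NULL && len >= 8 && len % 8 == 0);

    size_t n = 2;                        /* Next Header + Hdr Ext Len */

    while (n < len) {
        size_t step;

        if (hdr[n] == 0)                 /* Pad1 */
            step = 1;
        else if (hdr[n] == 1 && n + 1 < len)
            step = 2 + (size_t)hdr[n + 1];   /* PadN */
        else
            break;                       /* a real option: stop here */
        if (n + step > len || n + step > 257)
            break;                       /* PadN data is at most 255 */
        n += step;
    }
    hdr[0] = 1;                          /* PadN */
    hdr[1] = (uint8_t)(n - 2);           /* length of the pad data */
    memset(hdr + 2, 0, n - 2);
    return n;
}
```

After this step the kernel would still need to append the bytes to the first object's options and recompute the Hdr Ext Len byte of the combined header.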
The following code fragment is a redo of the first example shown
(building two options in a single ancillary data object) but this
time we use inet6_option_alloc().
u_int8_t  *typep;
struct msghdr  msg;
struct cmsghdr  *cmsgptr;
struct ipv6_opt_X  *optXp;    /* now a pointer, not a struct */
struct ipv6_opt_Y  *optYp;    /* now a pointer, not a struct */

/* room for the cmsghdr structure plus both options and padding */
msg.msg_control = malloc(inet6_option_space(sizeof(*optXp) +
                                            sizeof(*optYp)));

inet6_option_init(msg.msg_control, &cmsgptr, IPV6_HOPOPTS);

typep = inet6_option_alloc(cmsgptr, IPV6_OPT_X_LEN,
                           IPV6_OPT_X_MULTX, IPV6_OPT_X_OFFSETY);
optXp = (struct ipv6_opt_X *) (typep - IPV6_OPT_X_OFFSETY);
optXp->opt_X_type = IPV6_OPT_X_TYPE;
optXp->opt_X_len = IPV6_OPT_X_LEN;
optXp->opt_X_val1 = <32-bit value>;
optXp->opt_X_val2 = <64-bit value>;

typep = inet6_option_alloc(cmsgptr, IPV6_OPT_Y_LEN,
                           IPV6_OPT_Y_MULTX, IPV6_OPT_Y_OFFSETY);
optYp = (struct ipv6_opt_Y *) (typep - IPV6_OPT_Y_OFFSETY);
optYp->opt_Y_type = IPV6_OPT_Y_TYPE;
optYp->opt_Y_len = IPV6_OPT_Y_LEN;
optYp->opt_Y_val1 = <8-bit value>;
optYp->opt_Y_val2 = <16-bit value>;
optYp->opt_Y_val3 = <32-bit value>;

/* cmsg_len already includes the cmsghdr structure */
msg.msg_controllen = cmsgptr->cmsg_len;
Notice that inet6_option_alloc() returns a pointer to the 8-bit
option type field. If the program wants a pointer to an option
structure that includes the padding at the front (as shown in our
definitions of the ipv6_opt_X and ipv6_opt_Y structures), the y-
offset at the beginning of the structure must be subtracted from the
returned pointer.
The following code fragment shows the processing of Hop-by-Hop
options using the inet6_option_next() function.
struct msghdr  msg;
struct cmsghdr  *cmsgptr;

/* fill in msg */
/* call recvmsg() */

for (cmsgptr = CMSG_FIRSTHDR(&msg); cmsgptr != NULL;
     cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) {
    if (cmsgptr->cmsg_level == IPPROTO_IPV6 &&
        cmsgptr->cmsg_type == IPV6_HOPOPTS) {
        u_int8_t  *tptr = NULL;

        while (inet6_option_next(cmsgptr, &tptr) == 0) {
            if (*tptr == IPV6_OPT_X_TYPE) {
                struct ipv6_opt_X  *optXp;

                optXp = (struct ipv6_opt_X *)
                            (tptr - IPV6_OPT_X_OFFSETY);
                <do whatever with> optXp->opt_X_val1;
                <do whatever with> optXp->opt_X_val2;
            } else if (*tptr == IPV6_OPT_Y_TYPE) {
                struct ipv6_opt_Y  *optYp;

                optYp = (struct ipv6_opt_Y *)
                            (tptr - IPV6_OPT_Y_OFFSETY);
                <do whatever with> optYp->opt_Y_val1;
                <do whatever with> optYp->opt_Y_val2;
                <do whatever with> optYp->opt_Y_val3;
            }
        }
        if (tptr != NULL)
            <error encountered by inet6_option_next()>;
    }
}
8. Destination Options
A variable number of Destination options can appear in one or more
Destination option headers. As defined in [1], a Destination options
header appearing before a Routing header is processed by the first
destination plus any subsequent destinations specified in the Routing
header, while a Destination options header appearing after a Routing
header is processed only by the final destination. As with the Hop-
by-Hop options, each option in a Destination options header is TLV-
encoded with a type, length, and value.
Today no Destination options are defined for IPv6 [1], although
proposals exist to use Destination options with mobility and
anycasting.
8.1. Receiving Destination Options
To receive Destination options the application must enable the
IPV6_DSTOPTS socket option:
int on = 1;
setsockopt(fd, IPPROTO_IPV6, IPV6_DSTOPTS, &on, sizeof(on));
All the Destination options appearing before a Routing header are
returned as one ancillary data object described by a cmsghdr
structure and all the Destination options appearing after a Routing
header are returned as another ancillary data object described by a
cmsghdr structure. For these ancillary data objects, the cmsg_level
member will be IPPROTO_IPV6 and the cmsg_type member will be
IPV6_DSTOPTS. These options are then processed by calling the
inet6_option_next() and inet6_option_find() functions.
8.2. Sending Destination Options
To send one or more Destination options, the application just
specifies them as ancillary data in a call to sendmsg(). No socket
option need be set.
As described earlier, one set of Destination options can appear
before a Routing header, and one set can appear after a Routing
header. Each set can consist of one or more options.
Normally all the Destination options in a set are specified by a
single ancillary data object, since each option is itself TLV-
encoded. Multiple ancillary data objects, each containing one or
more Destination options, can also be specified, in which case the
kernel will combine all the Destination options in the set into a
single Destination extension header. But it should be more efficient
to use a single ancillary data object to describe all the Destination
options in a set. The cmsg_level member is set to IPPROTO_IPV6 and
the cmsg_type member is set to IPV6_DSTOPTS. The option is normally
constructed using the inet6_option_init(), inet6_option_append(), and
inet6_option_alloc() functions.
sendmsg() may return additional errors if a specified option is
invalid.
9. Source Route Option
Source routing in IPv6 is accomplished by specifying a Routing header
as an extension header. There can be different types of Routing
headers, but IPv6 currently defines only the Type 0 Routing header
[1]. This type supports up to 23 intermediate nodes. With this
maximum number of intermediate nodes, a source, and a destination,
there are 24 hops, each of which is defined as a strict or loose hop.
Source routing with the IPv4 sockets API (the IP_OPTIONS socket option)
requires the application to build the source route in the format that
appears as the IPv4 header option, requiring intimate knowledge of
the IPv4 options format. This IPv6 API, however, defines eight
functions that the application calls to build and examine a Routing
header. Four functions build a Routing header:
inet6_srcrt_space() - return #bytes required for ancillary data
inet6_srcrt_init() - initialize ancillary data for Routing header
inet6_srcrt_add() - add IPv6 address & flags to Routing header
inet6_srcrt_lasthop() - specify the flags for the final hop
Four functions deal with a returned Routing header:
inet6_srcrt_reverse() - reverse a Routing header
inet6_srcrt_segments() - return #segments in a Routing header
inet6_srcrt_getaddr() - fetch one address from a Routing header
inet6_srcrt_getflags() - fetch one flag from a Routing header
The function prototypes for these functions are all in the
<netinet/in.h> header.
A Routing header is passed between the application and the kernel as
an ancillary data object. The cmsg_level member has a value of
IPPROTO_IPV6 and the cmsg_type member has a value of IPV6_SRCRT. The
contents of the cmsg_data[] member are implementation dependent and
should not be accessed directly by the application; they should be
accessed using the eight functions that we are about to describe.
The following constants are defined in the <netinet/in.h> header:
#define IPV6_SRCRT_LOOSE 0 /* this hop need not be a neighbor */
#define IPV6_SRCRT_STRICT 1 /* this hop must be a neighbor */
#define IPV6_SRCRT_TYPE_0 0 /* IPv6 Routing header type 0 */
When a Routing header is specified, the destination address specified
for connect(), sendto(), or sendmsg() is the final destination
address of the datagram. The Routing header then contains the
addresses of all the intermediate nodes.
9.1. inet6_srcrt_space
size_t inet6_srcrt_space(int type, int segments);
This function returns the number of bytes required to hold a Routing
header of the specified type containing the specified number of
segments (addresses). The number of segments must be between 1 and
23, inclusive. The return value includes the size of the cmsghdr
structure that precedes the Routing header, and any required padding.
If the return value is 0, then either the type of the Routing header
is not supported by this implementation or the number of segments is
invalid for this type of Routing header.
(Note: This function returns the size but does not allocate the space
required for the ancillary data. This allows an application to
allocate a larger buffer, if other ancillary data objects are
desired, since all the ancillary data objects must be specified to
sendmsg() as a single msg_control buffer.)
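The size returned for a Type 0 Routing header can be estimated with the following sketch. It assumes the Type 0 layout of [1]: 8 fixed bytes (Next Header, Hdr Ext Len, Routing Type, Segments Left, Reserved, and the 24-bit Strict/Loose Bit Map) followed by one 16-byte address per segment. The names my_srcrt_space() and rt0_hdr_ext_len() are hypothetical, and since the contents of cmsg_data[] are implementation dependent, a real implementation may store the header differently.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

#define RT0_FIXED_LEN  8    /* fixed part of a Type 0 Routing header */
#define RT0_ADDR_LEN   16   /* one IPv6 address */
#define RT0_MAX_SEGS   23   /* limit stated in this section */

/* Hypothetical stand-in for inet6_srcrt_space(); 0 means invalid. */
static size_t
my_srcrt_space(int type, int segments)
{
    if (type != 0 || segments < 1 || segments > RT0_MAX_SEGS)
        return 0;
    return CMSG_SPACE(RT0_FIXED_LEN + RT0_ADDR_LEN * (size_t)segments);
}

/* Hdr Ext Len counts 8-byte units beyond the first 8 bytes. */
static int
rt0_hdr_ext_len(int segments)
{
    assert(segments >= 1 && segments <= RT0_MAX_SEGS);
    return 2 * segments;
}
```

For three segments the Routing header itself occupies 56 bytes and its Hdr Ext Len field is 6.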
9.2. inet6_srcrt_init
struct cmsghdr *inet6_srcrt_init(void *bp, int type);
This function initializes the buffer pointed to by bp to contain a
cmsghdr structure followed by a Routing header of the specified type.
The cmsg_len member of the cmsghdr structure is initialized to the
size of the structure plus the amount of space required by the
Routing header. The cmsg_level and cmsg_type members are also
initialized as required.
The caller must allocate the buffer and its size can be determined by
calling inet6_srcrt_space().
The return value is the pointer to the cmsghdr structure, and this is
then used as the first argument to the next two functions. If the
type of Routing header is not supported by the implementation, the
return value is NULL.
9.3. inet6_srcrt_add
int inet6_srcrt_add(struct cmsghdr *cmsg,
const struct in6_addr *addr, unsigned int flags);
This function adds the address pointed to by addr to the end of the
Routing header being constructed and sets the type of this hop to the
value of flags. For an IPv6 Type 0 Routing header, flags must be
either IPV6_SRCRT_LOOSE or IPV6_SRCRT_STRICT.
If successful, the cmsg_len member of the cmsghdr structure is
updated to account for the new address in the Routing header and the
return value of the function is 0.
If the address would exceed the limits of the Routing header, the
return value of the function is ENOSPC. If flags specifies an
invalid value for the Routing header, the return value of the
function is EINVAL.
9.4. inet6_srcrt_lasthop
int inet6_srcrt_lasthop(struct cmsghdr *cmsg,
unsigned int flags);
This function specifies the Strict/Loose flag for the final hop of a
source route. For an IPv6 Type 0 Routing header, flags must be
either IPV6_SRCRT_LOOSE or IPV6_SRCRT_STRICT.
Notice that a source route that specifies N intermediate nodes
requires N+1 Strict/Loose flags. This requires N calls to
inet6_srcrt_add() followed by one call to inet6_srcrt_lasthop().
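The N+1 flags can be pictured as bits in the 24-bit Strict/Loose Bit Map of the Type 0 Routing header. The sketch below assumes that the bits are numbered 0 to 23 from left to right, which is our reading of [1], and holds the map in the low 24 bits of a 32-bit integer. The names slmap_set() and slmap_get() and the SRCRT_* constants are hypothetical; an application would use inet6_srcrt_add(), inet6_srcrt_lasthop(), and inet6_srcrt_getflags() instead of touching the map directly.

```c
#include <assert.h>
#include <stdint.h>

#define SRCRT_LOOSE   0
#define SRCRT_STRICT  1

/* Return `map` with flag number `hop` (0..23) set to `flag`. */
static uint32_t
slmap_set(uint32_t map, int hop, unsigned int flag)
{
    assert(hop >= 0 && hop <= 23);
    assert(flag == SRCRT_LOOSE || flag == SRCRT_STRICT);

    uint32_t bit = (uint32_t)1 << (23 - hop);   /* bit 0 is leftmost */

    return flag == SRCRT_STRICT ? (map | bit) : (map & ~bit);
}

/* Return flag number `hop` (0..23) from `map`. */
static unsigned int
slmap_get(uint32_t map, int hop)
{
    assert(hop >= 0 && hop <= 23);
    return (map >> (23 - hop)) & 1u;
}
```

A route with three intermediate nodes uses flags 0 through 3: three are supplied with the addresses and the fourth with the final hop.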
9.5. inet6_srcrt_reverse
int inet6_srcrt_reverse(const struct cmsghdr *in, struct cmsghdr *out);
This function takes a Routing header that was received as ancillary
data (pointed to by the first argument) and writes a new Routing
header that sends datagrams along the reverse of that route. Both
arguments are allowed to point to the same buffer (that is, the
reversal can occur in place). The return value of the function is 0
on success.
If the type of Routing header is not supported by the implementation,
the return value of the function is EOPNOTSUPP. If the Routing
header information is invalid, the return value of the function is
EINVAL.
9.6. inet6_srcrt_segments
int inet6_srcrt_segments(const struct cmsghdr *cmsg);
This function returns the number of segments (addresses) contained in
the Routing header described by cmsg. On success the return value is
between 1 and 23, inclusive. The return value is -1 if the cmsghdr
structure does not describe a valid Routing header or is a Routing
header of an unsupported type.
9.7. inet6_srcrt_getaddr
struct in6_addr *inet6_srcrt_getaddr(struct cmsghdr *cmsg, int index);
This function returns a pointer to the IPv6 address specified by
index (which must have a value between 1 and the value returned by
inet6_srcrt_segments()) in the Routing header described by cmsg. An
application should first call inet6_srcrt_segments() to obtain the
number of segments in the Routing header.
If index refers to an address beyond the end of the Routing header,
the return value is NULL.
9.8. inet6_srcrt_getflags
int inet6_srcrt_getflags(const struct cmsghdr *cmsg, int offset);
This function returns the flags value indexed by offset (which must
have a value between 0 and the value returned by
inet6_srcrt_segments()) in the Routing header described by cmsg. For
an IPv6 Type 0 Routing header the return value will be either
IPV6_SRCRT_LOOSE or IPV6_SRCRT_STRICT.
If offset refers to a segment beyond the end of the Routing header,
the return value is -1.
(Note: Addresses are indexed starting at 1, and flags starting at 0,
to maintain consistency with the terminology and figures in [1].)
9.9. Source Route Example
As an example of these source routing functions, we go through the
function calls for the example on p. 18 of [1]. The source is S, the
destination is D, and the three intermediate nodes are I1, I2, and
I3. f0, f1, f2, and f3 are the Strict/Loose flags for each hop.
            f0        f1        f2        f3
        S -----> I1 -----> I2 -----> I3 -----> D

src:    *    S         S         S         S   S
dst:    D    I1        I2        I3        D   D
A[1]:   I1   I2        I1        I1        I1  I1
A[2]:   I2   I3        I3        I2        I2  I2
A[3]:   I3   D         D         D         I3  I3
#seg:   3    3         2         1         0   3

check:  f0       f1        f2        f3
src and dst are the source and destination IPv6 addresses in the IPv6
header. A[1], A[2], and A[3] are the three addresses in the Routing
header. #seg is the Segments Left field in the Routing header.
check indicates which bit of the Strict/Loose Bit Map (0 through 3,
specified as f0 through f3) that node checks.
The six values in the column beneath node S are the values in the
Routing header specified by the application using sendmsg(). The
function calls by the sender would look like:
void *ptr;
struct msghdr msg;
struct cmsghdr *cmsgptr;
struct sockaddr_in6 I1, I2, I3, D;
unsigned int f0, f1, f2, f3;
/* allocate room for a Type 0 Routing header with 3 addresses */
ptr = malloc(inet6_srcrt_space(IPV6_SRCRT_TYPE_0, 3));
cmsgptr = inet6_srcrt_init(ptr, IPV6_SRCRT_TYPE_0);
/* add the intermediate nodes in order, each with its flag */
inet6_srcrt_add(cmsgptr, &I1.sin6_addr, f0);
inet6_srcrt_add(cmsgptr, &I2.sin6_addr, f1);
inet6_srcrt_add(cmsgptr, &I3.sin6_addr, f2);
inet6_srcrt_lasthop(cmsgptr, f3);
msg.msg_control = ptr;
/* cmsg_len already includes the cmsghdr{}, so use it as is */
msg.msg_controllen = cmsgptr->cmsg_len;
/* finish filling in msg{}, msg_name = D */
/* call sendmsg() */
We also assume that the source address for the socket is not
specified (i.e., the asterisk in the figure).
The four columns of six values that are then shown between the five
nodes are the values of the fields in the packet while the packet is
in transit between the two nodes. Notice that before the packet is
sent by the source node S, the source address is chosen (replacing
the asterisk), I1 becomes the destination address of the datagram,
the two addresses A[2] and A[3] are "shifted up", and D is moved to
A[3]. If f0 is IPV6_SRCRT_STRICT, then I1 must be a neighbor of S.
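The columns of the figure can be reproduced with a short simulation.
The following is a sketch under stated assumptions, not kernel code:
the pkt structure and the sender_prepare() and forward() functions
are hypothetical names, node names stand in for IPv6 addresses, and
the per-hop rule (decrement Segments Left, then swap the destination
address with the next address in the list) is the Type 0 Routing
header processing described in [1]. Strict/Loose checking is omitted.

```c
#include <assert.h>

#define NSEG 3                    /* three intermediate nodes */

struct pkt {
    const char *dst;              /* IPv6 destination address */
    const char *a[NSEG];          /* A[1]..A[3] stored in a[0]..a[2] */
    int         segleft;          /* Segments Left */
};

/*
 * Source node: the first hop becomes the destination address, the
 * remaining addresses shift up, and the final destination moves to
 * the last slot.
 */
void sender_prepare(struct pkt *p)
{
    const char *final = p->dst;

    p->dst = p->a[0];
    for (int i = 0; i + 1 < NSEG; i++)
        p->a[i] = p->a[i + 1];
    p->a[NSEG - 1] = final;
    p->segleft = NSEG;
}

/*
 * Intermediate node: decrement Segments Left, then swap the
 * destination address with A[i], where i = NSEG - Segments Left.
 */
void forward(struct pkt *p)
{
    const char *tmp;
    int i;

    assert(p->segleft > 0);
    p->segleft--;
    i = NSEG - p->segleft;        /* 1-based index into A[] */
    tmp = p->dst;
    p->dst = p->a[i - 1];
    p->a[i - 1] = tmp;
}
```

Starting from dst = D and A[] = {I1, I2, I3}, one call to
sender_prepare() followed by three calls to forward() walks through
the four in-transit columns of the figure in order.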
The columns of values that are shown beneath the destination node are
the values returned by recvmsg(), assuming the application has
enabled both the IPV6_PKTINFO and IPV6_SRCRT socket options. The
source address is S (contained in the sockaddr_in6 structure pointed
to by the msg_name member), the destination address is D (returned as
an ancillary data object in an in6_pktinfo structure), and the
ancillary data object specifying the source route will contain three
addresses (I1, I2, and I3) and four flags (f0, f1, f2, and f3). The
number of segments in the Routing header is known from the Hdr Ext
Len field in the Routing header (a value of 6, indicating 3
addresses).
The return value from inet6_srcrt_segments() will be 3 and
inet6_srcrt_getaddr(1) will return I1, inet6_srcrt_getaddr(2) will
return I2, and inet6_srcrt_getaddr(3) will return I3. The return
value from inet6_srcrt_getflags(0) will be f0, inet6_srcrt_getflags(1)
will return f1, inet6_srcrt_getflags(2) will return f2, and
inet6_srcrt_getflags(3) will return f3.
If the receiving application then calls inet6_srcrt_reverse(), the
order of the three addresses will become I3, I2, and I1, and the
order of the four Strict/Loose flags will become f3, f2, f1, and f0.
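The effect of the reversal on the three addresses and four flags can
be shown with a small model. This is a sketch only: the rev_model
structure and the model_reverse() function are hypothetical names,
node names stand in for IPv6 addresses, and the flags are held in an
array instead of the Strict/Loose Bit Map.

```c
#include <assert.h>
#include <stddef.h>

#define NSEG 3

struct rev_model {
    const char *addr[NSEG];       /* Address[1..3] in addr[0..2] */
    int         flag[NSEG + 1];   /* f0..f3 in flag[0..3] */
};

/* Reverse both lists; a temporary copy lets in and out be the same. */
void model_reverse(const struct rev_model *in, struct rev_model *out)
{
    struct rev_model tmp;

    assert(in != NULL && out != NULL);
    for (int i = 0; i < NSEG; i++)
        tmp.addr[i] = in->addr[NSEG - 1 - i];
    for (int i = 0; i <= NSEG; i++)
        tmp.flag[i] = in->flag[NSEG - i];
    *out = tmp;
}
```

Because there is one more flag than there are addresses, the two
lists are reversed independently: address k moves to position
N+1-k, while flag k moves to position N-k.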
We can also show what an implementation might store in the ancillary
data object as the Routing header is being built by the sending
process. If we assume a 32-bit architecture where sizeof(struct
cmsghdr) equals 12, with a desired alignment of 4-byte boundaries,
then the call to inet6_srcrt_space(IPV6_SRCRT_TYPE_0, 3) returns 68:
12 bytes for the cmsghdr structure and 56 bytes for the Routing
header (8 + 3*16).
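The size arithmetic can be written out as a short function. This is
a sketch of the calculation only: model_srcrt_space() is a
hypothetical name, the size of the cmsghdr structure is passed in as
a parameter (and is assumed to already satisfy the alignment
requirement) so that the 12-byte figure used here can be reproduced
on any host, and returning 0 for an unusable segment count is an
assumption of this sketch.

```c
#include <stddef.h>

/*
 * Bytes needed for an ancillary data object holding a Type 0 Routing
 * header: the cmsghdr structure, the 8-byte fixed part of the
 * Routing header, and 16 bytes per address.
 */
size_t model_srcrt_space(size_t cmsghdr_size, int segments)
{
    if (segments < 1 || segments > 23)   /* Type 0 holds 1..23 addresses */
        return 0;
    return cmsghdr_size + 8 + (size_t)segments * 16;
}
```

For example, model_srcrt_space(12, 3) evaluates to 12 + 8 + 48 = 68.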
The call to inet6_srcrt_init() initializes the ancillary data object
to contain a Type 0 Routing header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_len = 20 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_level = IPPROTO_IPV6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_type = IPV6_SRCRT |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | Hdr Ext Len=0 | Routing Type=0| Seg Left=0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved | Strict/Loose Bit Map |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The first call to inet6_srcrt_add() adds I1 to the list.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_len = 36 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_level = IPPROTO_IPV6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_type = IPV6_SRCRT |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | Hdr Ext Len=2 | Routing Type=0| Seg Left=1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |X| Strict/Loose Bit Map |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Address[1] = I1 +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Bit 0 of the Strict/Loose Bit Map contains the value f0, which we
just mark as X. cmsg_len is incremented by 16, the Hdr Ext Len field
is incremented by 2, and the Segments Left field is incremented by 1.
The next call to inet6_srcrt_add() adds I2 to the list.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_len = 52 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_level = IPPROTO_IPV6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_type = IPV6_SRCRT |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | Hdr Ext Len=4 | Routing Type=0| Seg Left=2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |X|X| Strict/Loose Bit Map |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Address[1] = I1 +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Address[2] = I2 +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The next bit of the Strict/Loose Bit Map contains the value f1.
cmsg_len is incremented by 16, the Hdr Ext Len field is incremented
by 2, and the Segments Left field is incremented by 1.
The last call to inet6_srcrt_add() adds I3 to the list.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_len = 68 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_level = IPPROTO_IPV6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| cmsg_type = IPV6_SRCRT |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | Hdr Ext Len=6 | Routing Type=0| Seg Left=3 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |X|X|X| Strict/Loose Bit Map |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Address[1] = I1 +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Address[2] = I2 +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
+ +
| |
+ Address[3] = I3 +
| |
+ +
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The next bit of the Strict/Loose Bit Map contains the value f2.
cmsg_len is incremented by 16, the Hdr Ext Len field is incremented
by 2, and the Segments Left field is incremented by 1.
Finally, the call to inet6_srcrt_lasthop() sets the next bit of the
Strict/Loose Bit Map to the value specified by f3. All the lengths
remain unchanged.
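The bookkeeping shown in the preceding figures can be summarized in
code. The following is a sketch of what an implementation might do,
under the same assumptions as the figures (a 12-byte cmsghdr
structure and at most three addresses). The build_model structure
and the model_init(), model_add(), and model_lasthop() functions are
hypothetical names and are not the inet6_srcrt_xxx() functions
themselves. Bit 0 of the 24-bit Strict/Loose Bit Map is taken to be
the leftmost bit, as in the figures, and a nonzero flag argument is
treated as "strict".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_HDRSIZE 12      /* assumed sizeof(struct cmsghdr) */
#define MODEL_NADDR    3      /* room for the three addresses in the example */

struct build_model {
    unsigned cmsg_len;        /* length of the ancillary data object */
    uint8_t  hdr_ext_len;     /* Hdr Ext Len, in 8-byte units */
    uint8_t  seg_left;        /* Segments Left */
    uint32_t bitmap;          /* low 24 bits; map bit 0 is the MSB */
    uint8_t  addr[MODEL_NADDR][16];
};

/* Set bit number 'bit' (0 = leftmost of 24) if the hop is strict. */
static void set_flag(struct build_model *m, int bit, unsigned strict)
{
    if (strict)
        m->bitmap |= (uint32_t)1 << (23 - bit);
}

/* Empty Type 0 Routing header: cmsghdr plus the 8-byte fixed part. */
void model_init(struct build_model *m)
{
    assert(m != NULL);
    memset(m, 0, sizeof(*m));
    m->cmsg_len = MODEL_HDRSIZE + 8;
}

/* Append one address and record its flag; -1 if the list is full. */
int model_add(struct build_model *m, const uint8_t a[16], unsigned strict)
{
    if (m->seg_left >= MODEL_NADDR)
        return -1;
    set_flag(m, m->seg_left, strict);
    memcpy(m->addr[m->seg_left], a, 16);
    m->cmsg_len += 16;
    m->hdr_ext_len += 2;
    m->seg_left += 1;
    return 0;
}

/* Record the flag for the final hop; no length changes. */
void model_lasthop(struct build_model *m, unsigned strict)
{
    set_flag(m, m->seg_left, strict);
}
```

After model_init() and three calls to model_add(), cmsg_len is 68,
Hdr Ext Len is 6, and Segments Left is 3, matching the last figure.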
10. Ordering of Ancillary Data and IPv6 Extension Headers
Three IPv6 extension headers can be specified by the application and
returned to the application using ancillary data with sendmsg() and
recvmsg(): Hop-by-Hop options, Destination options, and the Routing
header. When multiple ancillary data objects are transferred via
sendmsg() or recvmsg() and these objects represent any of these three
extension headers, their placement in the control buffer is directly
tied to their location in the corresponding IPv6 datagram. This API
imposes some ordering constraints when using multiple ancillary data
objects with sendmsg().
When multiple IPv6 Hop-by-Hop options having the same option type are
specified, these options will be inserted into the Hop-by-Hop options
header in the same order as they appear in the control buffer. But
when multiple Hop-by-Hop options having different option types are
specified, these options may be reordered by the kernel to reduce
padding in the Hop-by-Hop options header. Hop-by-Hop options may
appear anywhere in the control buffer and will always be collected by
the kernel and placed into a single Hop-by-Hop options header that
immediately follows the IPv6 header.
Similar rules apply to the Destination options: (1) those of the same
type will appear in the same order as they are specified, and (2)
those of differing types may be reordered. But the kernel will build
up to two Destination options headers: one to precede the Routing
header and one to follow the Routing header. If the application
specifies a Routing header then all Destination options that appear
in the control buffer before the Routing header will appear in a
Destination options header before the Routing header and these
options might be reordered, subject to the two rules that we just
stated. Similarly all Destination options that appear in the control
buffer after the Routing header will appear in a Destination options
header after the Routing header, and these options might be
reordered, subject to the two rules that we just stated.
As an example, assume that an application specifies control
information to sendmsg() containing six ancillary data objects: the
first containing two Hop-by-Hop options, the second containing one
Destination option, the third containing two Destination options, the
fourth containing a source route, the fifth containing a Hop-by-Hop
option, and the sixth containing two Destination options. We also
assume that all the Hop-by-Hop options are of different types, as are
all the Destination options. We number these options 1-9,
corresponding to their order in the control buffer, and show them on
the left below.
In the middle we show the final arrangement of the options in the
extension headers built by the kernel. On the right we show the four
ancillary data objects returned to the receiving application.
Sender's                                      Receiver's
Ancillary Data      -->  IPv6 Extension  -->  Ancillary Data
Objects                  Headers              Objects
------------------       ---------------      --------------
HOPOPT-1,2 (first)       HOPHDR(J,7,1,2)      HOPOPT-7,1,2
DSTOPT-3                 DSTHDR(4,5,3)        DSTOPT-4,5,3
DSTOPT-4,5               RTGHDR(6)            SRCRT-6
SRCRT-6                  DSTHDR(8,9)          DSTOPT-8,9
HOPOPT-7
DSTOPT-8,9 (last)
The sender's two Hop-by-Hop ancillary data objects are reordered, as
are the first two Destination ancillary data objects. We also show a
Jumbo Payload option (denoted as J) inserted by the kernel before the
sender's three Hop-by-Hop options. The first three Destination
options must appear in a Destination header before the Routing
header, and the final two Destination options must appear in a
Destination header after the Routing header.
If Destination options are specified in the control buffer after a
Routing header, or if Destination options are specified without a
Routing header, the kernel will place those Destination options after
an authentication header and/or an encapsulating security payload
header, if present.
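The grouping rules of this section can be sketched as a function that
sorts the options in a control buffer into the extension headers that
would carry them. This is an illustration only: the anc_obj and
ext_layout structures and the layout_build() function are
hypothetical names, options are represented by their numbers from the
example, and the sketch keeps options in control-buffer order, which
is one of the orderings the kernel is allowed to choose (a real
kernel may reorder options of different types and may insert options
of its own, such as the Jumbo Payload option).

```c
#include <stddef.h>

#define MAXOPT 16

enum obj_kind { OBJ_HOPOPT, OBJ_DSTOPT, OBJ_SRCRT };

struct anc_obj {              /* one ancillary data object */
    enum obj_kind kind;
    int nopt;                 /* number of options it carries */
    int opt[4];               /* option numbers, as in the example */
};

struct ext_layout {
    int nhop, hop[MAXOPT];    /* the single Hop-by-Hop options header */
    int ndst1, dst1[MAXOPT];  /* Destination options before Routing hdr */
    int have_rt;              /* nonzero if a Routing header is present */
    int ndst2, dst2[MAXOPT];  /* Destination options after Routing hdr */
};

static void push(int *list, int *count, int value)
{
    if (*count < MAXOPT)
        list[(*count)++] = value;
}

void layout_build(const struct anc_obj *objs, size_t n,
                  struct ext_layout *out)
{
    size_t rt = n;            /* position of the source route, if any */

    out->nhop = out->ndst1 = out->ndst2 = out->have_rt = 0;
    for (size_t i = 0; i < n; i++) {
        if (objs[i].kind == OBJ_SRCRT) {
            rt = i;
            out->have_rt = 1;
            break;
        }
    }
    for (size_t i = 0; i < n; i++) {
        for (int k = 0; k < objs[i].nopt; k++) {
            int v = objs[i].opt[k];

            if (objs[i].kind == OBJ_HOPOPT)
                push(out->hop, &out->nhop, v);     /* position irrelevant */
            else if (objs[i].kind == OBJ_DSTOPT && out->have_rt && i < rt)
                push(out->dst1, &out->ndst1, v);   /* before Routing hdr */
            else if (objs[i].kind == OBJ_DSTOPT)
                push(out->dst2, &out->ndst2, v);   /* after, or no Routing hdr */
        }
    }
}
```

Applied to the six objects of the example, the sketch places options
1, 2, and 7 in the Hop-by-Hop options header, options 3, 4, and 5 in
the first Destination options header, and options 8 and 9 in the
second.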
11. IPv6-Specific Options with IPv4-Mapped IPv6 Addresses
The various socket options and ancillary data specifications defined
in this document apply only to true IPv6 sockets. It is possible to
create an IPv6 socket that actually sends and receives IPv4 packets,
using IPv4-mapped IPv6 addresses, but the mapping of the options
defined in this document to an IPv4 datagram is beyond the scope of
this document.
In general, attempting to specify an IPv6-only option, such as the
Hop-by-Hop options, Destination options, or Routing header on an IPv6
socket that is using IPv4-mapped IPv6 addresses, will probably result
in an error. Some implementations, however, may provide access to
the packet information (source/destination address, send/receive
interface, and hop limit) on an IPv6 socket that is using IPv4-mapped
IPv6 addresses.
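An application that wants to avoid such an error can test whether the
peer address is an IPv4-mapped IPv6 address before specifying any of
the IPv6-only options. The following sketch uses inet_pton() and the
IN6_IS_ADDR_V4MAPPED() macro from the basic API [2]; the function
name is_v4mapped_text() is hypothetical and the addresses mentioned
below are documentation examples.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Returns 1 if the textual address is an IPv4-mapped IPv6 address,
 * 0 if it is some other IPv6 address, and -1 if it cannot be parsed.
 */
int is_v4mapped_text(const char *text)
{
    struct in6_addr a;

    if (inet_pton(AF_INET6, text, &a) != 1)
        return -1;
    return IN6_IS_ADDR_V4MAPPED(&a) ? 1 : 0;
}
```

For example, "::ffff:192.0.2.1" is reported as IPv4-mapped, while
"2001:db8::1" is not. An application holding a sockaddr_in6
structure would apply the macro directly to its sin6_addr member.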
12. rresvport_af
The rresvport() function is used by the rcmd() function, and this
function is in turn called by many of the "r" commands such as
rlogin. While new applications are not being written to use the
rcmd() function, legacy applications such as rlogin will continue to
use it and these will be ported to IPv6.
rresvport() creates an IPv4/TCP socket and binds a "reserved port" to
the socket. Instead of defining an IPv6 version of this function we
define a new function that takes an address family as its argument.
#include <unistd.h>
int rresvport_af(int *port, int family);
This function behaves the same as the existing rresvport() function,
except that it is not limited to creating an IPv4/TCP socket: it can
also create an IPv6/TCP socket. The family argument is either AF_INET
or AF_INET6,
and a new error return is EAFNOSUPPORT if the address family is not
supported.
(Note: There is little consensus on which header defines the
rresvport() and rcmd() function prototypes. 4.4BSD defines it in
<unistd.h>, others in <netdb.h>, and others don't define the function
prototypes at all.)
(Note: We define this function only, and do not define something like
rcmd_af() or rcmd6(). The reason is that rcmd() calls
gethostbyname(), which returns the type of address: AF_INET or
AF_INET6. It should therefore be possible to modify rcmd() to
support either IPv4 or IPv6, based on the address family returned by
gethostbyname().)
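One possible shape for such a function is sketched below. It is not
a reference implementation: the name rresvport_af_sketch() is
hypothetical, and the search strategy (start at the port supplied by
the caller, normally 1023, and count down to 512 while the port is in
use) follows the traditional BSD rresvport() behavior and is an
assumption here. Binding a reserved port normally requires
privilege.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define RESV_LOW 512          /* lowest port tried (assumed, as in BSD) */

int rresvport_af_sketch(int *port, int family)
{
    struct sockaddr_storage ss;
    socklen_t len;
    in_port_t *portp;
    int s, saved;

    memset(&ss, 0, sizeof(ss));
    if (family == AF_INET) {
        struct sockaddr_in *sin = (struct sockaddr_in *)&ss;
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = htonl(INADDR_ANY);
        portp = &sin->sin_port;
        len = sizeof(*sin);
    } else if (family == AF_INET6) {
        struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&ss;
        sin6->sin6_family = AF_INET6;
        sin6->sin6_addr = in6addr_any;
        portp = &sin6->sin6_port;
        len = sizeof(*sin6);
    } else {
        errno = EAFNOSUPPORT;             /* the new error return */
        return -1;
    }

    if ((s = socket(family, SOCK_STREAM, 0)) < 0)
        return -1;
    for (;;) {
        *portp = htons((unsigned short)*port);
        if (bind(s, (struct sockaddr *)&ss, len) == 0)
            return s;                     /* *port holds the bound port */
        if (errno != EADDRINUSE || --*port < RESV_LOW) {
            saved = (errno != EADDRINUSE) ? errno : EAGAIN;
            close(s);
            errno = saved;                /* preserve the reason */
            return -1;
        }
    }
}
```

A caller would typically set an int to 1023, pass its address along
with AF_INET or AF_INET6, and on success use the returned descriptor
exactly as it would use the one returned by rresvport().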
13. Future Items
Some additional items may require standardization, but no concrete
proposals have been made for the API to perform these tasks. These
may be addressed in a later document.
13.1. Path MTU Discovery and UDP
A standard method may be desirable for a UDP application to determine
the "maximum send transport-message size" (Section 5.1 of [3]) to a
given destination. This would let the UDP application send smaller
datagrams to the destination, avoiding fragmentation.
13.2. Neighbor Reachability and UDP
A standard method may be desirable for a UDP application to tell the
kernel that it is making forward progress with a given peer (Section
7.3.1 of [4]). This could save unneeded neighbor solicitations and
neighbor advertisements.
14. Summary of New Definitions
The following list summarizes the constants and structure
definitions discussed in this memo, sorted by header.
<netinet/icmp6.h> ICMPV6_DEST_UNREACH
<netinet/icmp6.h> ICMPV6_DEST_UNREACH_ADDR
<netinet/icmp6.h> ICMPV6_DEST_UNREACH_ADMIN
<netinet/icmp6.h> ICMPV6_DEST_UNREACH_NOPORT
<netinet/icmp6.h> ICMPV6_DEST_UNREACH_NOROUTE
<netinet/icmp6.h> ICMPV6_DEST_UNREACH_NOTNEIGHBOR
<netinet/icmp6.h> ICMPV6_ECHOREPLY
<netinet/icmp6.h> ICMPV6_ECHOREQUEST
<netinet/icmp6.h> ICMPV6_INFOMSG_MASK
<netinet/icmp6.h> ICMPV6_MGM_QUERY
<netinet/icmp6.h> ICMPV6_MGM_REDUCTION
<netinet/icmp6.h> ICMPV6_MGM_REPORT
<netinet/icmp6.h> ICMPV6_PACKET_TOOBIG
<netinet/icmp6.h> ICMPV6_PARAMPROB
<netinet/icmp6.h> ICMPV6_PARAMPROB_HEADER
<netinet/icmp6.h> ICMPV6_PARAMPROB_NEXTHEADER
<netinet/icmp6.h> ICMPV6_PARAMPROB_OPTION
<netinet/icmp6.h> ICMPV6_TIME_EXCEEDED
<netinet/icmp6.h> ICMPV6_TIME_EXCEED_HOPS
<netinet/icmp6.h> ICMPV6_TIME_EXCEED_REASSEMBLY
<netinet/icmp6.h> ND6_NADVERFLAG_ISROUTER
<netinet/icmp6.h> ND6_NADVERFLAG_OVERRIDE
<netinet/icmp6.h> ND6_NADVERFLAG_SOLICITED
<netinet/icmp6.h> ND6_NEIGHBOR_ADVERTISEMENT
<netinet/icmp6.h> ND6_NEIGHBOR_SOLICITATION
<netinet/icmp6.h> ND6_OPT_ENDOFLIST
<netinet/icmp6.h> ND6_OPT_MTU
<netinet/icmp6.h> ND6_OPT_PI_A_BIT
<netinet/icmp6.h> ND6_OPT_PI_L_BIT
<netinet/icmp6.h> ND6_OPT_PREFIX_INFORMATION
<netinet/icmp6.h> ND6_OPT_REDIRECTED_HEADER
<netinet/icmp6.h> ND6_OPT_SOURCE_LINKADDR
<netinet/icmp6.h> ND6_OPT_TARGET_LINKADDR
<netinet/icmp6.h> ND6_RADV_M_BIT
<netinet/icmp6.h> ND6_RADV_O_BIT
<netinet/icmp6.h> ND6_REDIRECT
<netinet/icmp6.h> ND6_ROUTER_ADVERTISEMENT
<netinet/icmp6.h> ND6_ROUTER_SOLICITATION
<netinet/icmp6.h> enum nd6_option{};
<netinet/icmp6.h> struct icmp6_filter{};
<netinet/icmp6.h> struct icmp6_hdr{};
<netinet/icmp6.h> struct nd6_nadvertisement{};
<netinet/icmp6.h> struct nd6_nsolicitation{};
<netinet/icmp6.h> struct nd6_opt_mtu{};
<netinet/icmp6.h> struct nd6_opt_prefix_info{};
<netinet/icmp6.h> struct nd6_redirect{};
<netinet/icmp6.h> struct nd6_router_advert{};
<netinet/icmp6.h> struct nd6_router_solicit{};
<netinet/in.h> IPPROTO_AH
<netinet/in.h> IPPROTO_DSTOPTS
<netinet/in.h> IPPROTO_ESP
<netinet/in.h> IPPROTO_FRAGMENT
<netinet/in.h> IPPROTO_HOPOPTS
<netinet/in.h> IPPROTO_ICMPV6
<netinet/in.h> IPPROTO_IPV6
<netinet/in.h> IPPROTO_NONE
<netinet/in.h> IPPROTO_ROUTING
<netinet/in.h> IPV6_DSTOPTS
<netinet/in.h> IPV6_HOPLIMIT
<netinet/in.h> IPV6_HOPOPTS
<netinet/in.h> IPV6_NEXTHOP
<netinet/in.h> IPV6_PKTINFO
<netinet/in.h> IPV6_PKTOPTIONS
<netinet/in.h> IPV6_SRCRT
<netinet/in.h> IPV6_SRCRT_LOOSE
<netinet/in.h> IPV6_SRCRT_STRICT
<netinet/in.h> IPV6_SRCRT_TYPE_0
<netinet/in.h> struct in6_pktinfo{};
<netinet/ip6.h> struct ip6_hdr{};
<sys/socket.h> struct cmsghdr{};
<sys/socket.h> struct msghdr{};
The following list summarizes the function and macro prototypes
discussed in this memo, sorted by header.
<netinet/icmp6.h> void ICMPV6_FILTER_SETBLOCK(int, struct icmp6_filter *);
<netinet/icmp6.h> void ICMPV6_FILTER_SETBLOCKALL(struct icmp6_filter *);
<netinet/icmp6.h> void ICMPV6_FILTER_SETPASS(int, struct icmp6_filter *);
<netinet/icmp6.h> void ICMPV6_FILTER_SETPASSALL(struct icmp6_filter *);
<netinet/icmp6.h> int ICMPV6_FILTER_WILLBLOCK(int,
const struct icmp6_filter *);
<netinet/icmp6.h> int ICMPV6_FILTER_WILLPASS(int,
const struct icmp6_filter *);
<netinet/in.h> int IN6_ARE_ADDR_EQUAL(const struct in6_addr *,
const struct in6_addr *);
<netinet/in.h> int inet6_flow_assign(int, struct sockaddr_in6 *,
const void *, size_t);
<netinet/in.h> int inet6_flow_free(int, const struct sockaddr_in6 *);
<netinet/in.h> int inet6_flow_reuse(int, int,
const struct sockaddr_in6 *);
<netinet/in.h> u_int8_t *inet6_option_alloc(struct cmsghdr *,
int, int, int);
<netinet/in.h> int inet6_option_append(struct cmsghdr *,
const u_int8_t *, int, int);
<netinet/in.h> int inet6_option_find(const struct cmsghdr *,
u_int8_t *, int);
<netinet/in.h> int inet6_option_init(void *, struct cmsghdr **, int);
<netinet/in.h> int inet6_option_next(const struct cmsghdr *,
u_int8_t **);
<netinet/in.h> int inet6_option_space(int);
<netinet/in.h> int inet6_srcrt_add(struct cmsghdr *,
const struct in6_addr *,
unsigned int);
<netinet/in.h> struct in6_addr *inet6_srcrt_getaddr(struct cmsghdr *,
int);
<netinet/in.h> int inet6_srcrt_getflags(const struct cmsghdr *, int);
<netinet/in.h> struct cmsghdr *inet6_srcrt_init(void *, int);
<netinet/in.h> int inet6_srcrt_lasthop(struct cmsghdr *, unsigned int);
<netinet/in.h> int inet6_srcrt_reverse(const struct cmsghdr *,
struct cmsghdr *);
<netinet/in.h> int inet6_srcrt_segments(const struct cmsghdr *);
<netinet/in.h> size_t inet6_srcrt_space(int, int);
<sys/socket.h> unsigned char *CMSG_DATA(const struct cmsghdr *);
<sys/socket.h> struct cmsghdr *CMSG_FIRSTHDR(const struct msghdr *);
<sys/socket.h> unsigned int CMSG_LEN(unsigned int);
<sys/socket.h> struct cmsghdr *CMSG_NXTHDR(const struct msghdr *mhdr,
const struct cmsghdr *);
<sys/socket.h> unsigned int CMSG_SPACE(unsigned int);
<unistd.h> int rresvport_af(int *, int);
15. Security Considerations
Allowing an application to pick flow labels at will could permit
interference with the routing of packets sent by another application
from the same host, or theft of a bandwidth reservation or other
network state created on behalf of another user.
The setting of certain Hop-by-Hop options and Destination options may
be restricted to privileged processes. Similarly some Hop-by-Hop
options and Destination options may not be returned to nonprivileged
applications.
16. Change History
Changes from the February 1997 Edition (-01 draft)
- Changed the name of the ip6hdr structure to ip6_hdr (Section 2.1)
for consistency with the icmp6hdr structure. Also changed the
name of the ip6hdrctl structure contained within the ip6_hdr
structure to ip6_hdrctl (Section 2.1). Finally, changed the name
of the icmp6hdr structure to icmp6_hdr (Section 2.2). All other
occurrences of this structure name, within the Neighbor Discovery
structures in Section 2.2.1, already contained the underscore.
- The "struct nd_router_solicit" and "struct nd_router_advert"
should both begin with "nd6_". (Section 2.2.2).
- Changed the name of in6_are_addr_equal to IN6_ARE_ADDR_EQUAL
(Section 2.3) for consistency with basic API address testing
functions. The header defining this macro is <netinet/in.h>.
- getprotobyname("ipv6") now returns 41, not 0 (Section 2.4).
- The first occurrence of "struct icmpv6_filter" in Section 3.2
should be "struct icmp6_filter".
- Changed the name of the CMSG_LENGTH() macro to CMSG_LEN()
(Section 4.3.5), since LEN is used throughout the <netinet/*.h>
headers.
- Corrected the argument name for the sample implementations of the
CMSG_SPACE() and CMSG_LEN() macros to be "length" (Sections 4.3.4
and 4.3.5).
- Corrected the socket option mentioned in Section 5.1 to specify
the interface for multicasting from IPV6_ADD_MEMBERSHIP to
IPV6_MULTICAST_IF.
- There were numerous errors in the previous draft that specified
<netinet/ip6.h> that should have been <netinet/in.h>. These have
all been corrected and the locations of all definitions are now
summarized in the new Section 14 ("Summary of New Definitions").
Changes from the October 1996 Edition (-00 draft)
- Numerous rationale added using the format (Note: ...).
- Added note that not all errors may be defined.
- Added note about ICMPv4, IGMPv4, and ARPv4 terminology.
- Changed the name of <netinet/ip6_icmp.h> to <netinet/icmp6.h>.
- Change some names in Section 2.2.1: ICMPV6_PKT_TOOBIG to
ICMPV6_PACKET_TOOBIG, ICMPV6_TIME_EXCEED to ICMPV6_TIME_EXCEEDED,
ICMPV6_ECHORQST to ICMPV6_ECHOREQUEST, ICMPV6_ECHORPLY to
ICMPV6_ECHOREPLY, ICMPV6_PARAMPROB_HDR to
ICMPV6_PARAMPROB_HEADER, ICMPV6_PARAMPROB_NXT_HDR to
ICMPV6_PARAMPROB_NEXTHEADER, and ICMPV6_PARAMPROB_OPTS to
ICMPV6_PARAMPROB_OPTION.
- Prepend the prefix "icmp6_" to the three members of the
icmp6_dataun union of the icmp6hdr structure (Section 2.2).
- Moved the neighbor discovery definitions into the
<netinet/icmp6.h> header, instead of being in their own header
(Section 2.2.1).
- Changed Section 2.3 ("Address Testing"). The basic macros are
now in the basic API.
- Added the new Section 2.4 on "Protocols File".
- Added note to raw sockets description that something like BPF or
DLPI must be used to read or write entire IPv6 packets.
- Corrected example of IPV6_CHECKSUM socket option (Section 3.1).
Also defined value of -1 to disable.
- Noted that <netinet/icmp6.h> defines all the ICMPv6 filtering
constants, macros, and structures (Section 3.2).
- Added note on magic number 10240 for amount of ancillary data
(Section 4.1).
- Added possible padding to picture of ancillary data (Section
4.2).
- Defined <sys/socket.h> header for CMSG_xxx() functions (Section
4.2).
- Note that the data returned by getsockopt(IPV6_PKTOPTIONS) for a
TCP socket is just from the optional headers, if present, of the
most recently received segment. Also note that control
information is never returned by recvmsg() for a TCP socket.
- Changed header for struct in6_pktinfo from <netinet/in.h> to
<netinet/ip6.h> (Section 5).
- Removed the old Sections 5.1 and 5.2, because the interface
identification functions went into the basic API.
- Redid Section 5 to support the hop limit field.
- New Section 5.4 ("Next Hop Address").
- New Section 6 ("Flow Labels").
- Changed all of Sections 7 and 8 dealing with Hop-by-Hop and
Destination options. We now define a set of inet6_option_XXX()
functions.
- Changed header for IPV6_SRCRT_xxx constants from <netinet/in.h>
to <netinet/ip6.h> (Section 9).
- Add inet6_srcrt_lasthop() function, and fix errors in description
of source routing (Section 9).
- Reworded some of the source routing descriptions to conform to
the terminology in [1].
- Added the example from [1] for the Routing header (Section 9.9).
- Expanded the example in Section 10 to show multiple options per
ancillary data object, and to show the receiver's ancillary data
objects.
- New Section 11 ("IPv6-Specific Options with IPv4-Mapped IPv6
Addresses").
- New Section 12 ("rresvport_af").
- Redid old Section 10 ("Additional Items") into new Section 13
("Future Items").
17. References
[1] Deering, S., Hinden, R., "Internet Protocol, Version 6 (IPv6),
Specification", RFC 1883, Dec. 1995.
[2] Gilligan, R. E., Thomson, S., Bound, J., Stevens, W., "Basic
Socket Interface Extensions for IPv6", Internet-Draft,
draft-ietf-ipngwg-bsd-api-07.txt, January 1997.
[3] McCann, J., Deering, S., Mogul, J., "Path MTU Discovery for IP
version 6", RFC 1981, Aug. 1996.
[4] Narten, T., Nordmark, E., Simpson, W., "Neighbor Discovery for
IP Version 6 (IPv6)", RFC 1970, Aug. 1996.
[5] Braden, R., Zhang, L., Berson, S., Herzog, S., Jamin, S.,
"Resource ReSerVation Protocol (RSVP) -- Version 1 Functional
Specification", Internet-Draft, draft-ietf-rsvp-spec-14.txt,
November 1996.
18. Acknowledgments
Matt Thomas and Jim Bound have been working on the technical details
in this draft for over a year. Keith Sklower is the original
implementor of ancillary data in the BSD networking code. Craig Metz
provided lots of feedback, suggestions, and comments based on his
implementing many of these features as the document was being
written.
Matt Crawford designed the flow label interface.
The following provided comments on earlier drafts: Hamid Asayesh, Ran
Atkinson, Karl Auerbach, Matt Crawford, Sam T. Denton, Richard
Draves, Francis Dupont, Bob Gilligan, Tim Hartrick, Masaki Hirabaru,
Yoshinobu Inoue, Mukesh Kacker, A. N. Kuznetsov, der Mouse, John Moy,
Thomas Narten, Erik Nordmark, Tom Pusateri, Pedro Roque, Sameer Shah,
Peter Sjodin, Stephen P. Spackman, Quaizar Vohra, Carl Williams,
Steve Wise, and Kazu Yamamoto.
19. Authors' Addresses
W. Richard Stevens
1202 E. Paseo del Zorro
Tucson, AZ 85718
Email: rstevens@kohala.com
Matt Thomas
AltaVista Internet Software
LJO2-1/J8
30 Porter Rd
Littleton, MA 01460
Email: mattthomas@earthlink.net