Saturday, 21 March 2015

NMEA-0183 over IP: The unwritten rules for programmers

An earlier post presented a subjective overview of the current non-standardised use of the application layer of NMEA-0183 by independent and open source developers.  As discussed there, NMEA-0183 is a proprietary "whole stack" protocol running over serial lines.  While a proprietary standard method for transporting the highest layer of this protocol over IP networks does exist, it is not a standard used by open source programmers or indeed the vast majority of people writing marine data applications today.

This post is aimed at people writing applications which produce or consume marine data and wondering how to maximise their interoperability with other applications.  To re-emphasise: there are no agreed rules.  The basis for this post is 3 years of observing "how everyone else does it" and hacking and debugging networking with both kplex and OpenCPN in the context of other data sources and consumers.

The Basics

The canonical publicly-available reference for NMEA-0183 is Eric Raymond's "NMEA Revealed".  This post will not repeat the contents of that document, and familiarity with the structure of NMEA-0183 as explained there is assumed from here on.  Here we focus on how other application developers commonly put that information over Internet Protocol.

For open source programmers, purchasing the NMEA-0183 protocol document is probably not an option: publishing source code based on the standard may be considered a violation of copyright in some countries. I attempted to clarify this with the NMEA's legal department but they declined to reply to my email.

Application Layer Structure

Application layer structure tends to be almost identical to the documented serial line standard.

Start

The starting "$" or "!" is always included as part of the transmitted sentence. You should expect it when receiving and produce it when transmitting.

Body

Not all applications enforce correct line length but there should be no need to exceed what many sources cite as the maximum of 80 characters including the initial "!" or "$" but excluding terminating line delimiters.  Applications should expect longer lines to be rejected by receiving applications and there is generally no reason not to reject longer lines.

At time of writing neither OpenCPN nor kplex makes specific checks for "illegal" characters in sentence bodies.  Legal vs. illegal NMEA-0183 characters do not seem well documented in publicly available texts.  "$", "!", "*", the line feed character <LF> and carriage return <CR> are all likely to be used by parsers to divide up sentences. 

Line termination

Significant variation seems to exist in what applications transmit.  The "correct" NMEA-0183 sentence termination sequence <CR><LF> (carriage return, line feed: 0x0D 0x0A) is most common (and used by OpenCPN), should be accepted by all parsers, and should be used to terminate sentences when transmitting data.  Some applications terminate sentences with just a <LF> and I've encountered one Android application which transmitted GPS sentences terminated with just a <CR>.  For maximum compatibility, receiving applications should accept lines terminated with <CR> or <LF> and ignore subsequent characters until one marking the start of new data of interest (i.e. "$", "!" or, if supporting TAG blocks, "\").

Parsing

The wrong way to parse data is to do what kplex did in early iterations: assume everything between two <CR><LF> sequences constitutes a sentence and then check this string for "correct" structure.  A better way (used by the versions of OpenCPN and kplex current at time of writing) is to ignore all characters until a "!" or "$" marking the beginning of a sentence (or "\" denoting the beginning of a TAG block if supporting them), then read characters until the end of the sentence, and ignore everything after a terminating <CR> or <LF> delimiter until a new start-of-data character is seen.

Publicly available documents describing the introduction of TAG blocks in version 4 of the standard state that TAG blocks will be ignored by pre-existing NMEA-0183 parsers.  This tends to imply that the latter method is "correct".  Under that scheme TAG blocks will be ignored if not specifically supported.  Use of the latter scheme will also mean sentences are not rejected if inter-sentence line noise results in serial-to-IP converters inserting spurious characters into a data stream and will make accommodation of multiple line terminators (i.e. any combination of <CR> and <LF>) easier to code.
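The second parsing approach can be sketched roughly as follows. This is a minimal illustration in Python, not the code used by OpenCPN or kplex: the class name `SentenceReader` is hypothetical, and the decisions to restart on any "$" or "!" and to drop sentences longer than 80 characters are assumptions based on the discussion above.

```python
START_CHARS = b"$!"      # characters that begin a sentence
TERMINATORS = b"\r\n"    # accept either <CR> or <LF> as end of sentence
MAX_LEN = 80             # commonly cited maximum, excluding terminators


class SentenceReader:
    """Incrementally extract NMEA-0183 sentences from a byte stream."""

    def __init__(self, max_len=MAX_LEN):
        self.max_len = max_len
        self._buf = bytearray()
        self._in_sentence = False

    def feed(self, data):
        """Consume a chunk of bytes; return the complete sentences it finished."""
        sentences = []
        for byte in data:
            if byte in START_CHARS:
                # Assumption: a start character always begins a new sentence,
                # discarding any partial sentence collected so far.
                self._buf = bytearray([byte])
                self._in_sentence = True
            elif not self._in_sentence:
                # Line noise, unsupported TAG blocks, leftover terminators.
                continue
            elif byte in TERMINATORS:
                sentences.append(bytes(self._buf))
                self._in_sentence = False
            else:
                self._buf.append(byte)
                if len(self._buf) > self.max_len:
                    # Over-long line: give up and wait for the next start.
                    self._in_sentence = False
        return sentences
```

Because state is kept between calls, the same reader works whether the bytes arrive from a TCP stream, a series of UDP datagrams or a serial port, and a sentence split across two reads is still recovered.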

TAG blocks

TAG blocks, as detailed here and here, are not widely supported but it has been reported that at least one hardware multiplexer aimed at recreational boaters produces them. OpenCPN's parsing strips TAG blocks. kplex recognises them, validates them, then discards them, but is also capable of producing some TAG blocks.  If not explicitly supporting them, applications should ensure their parsing routines will silently discard TAG blocks without discarding the associated sentence, as discussed above.  If your application produces TAG blocks, providing an option to disable them may be useful in case the user has another application whose parsing rules have problems with them.
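For code that already works on whole lines, separating a leading TAG block from its sentence might look like the sketch below. It assumes only that a TAG block is delimited by a pair of "\" characters in front of the sentence; the function name `split_tag_block` is hypothetical and no validation of the TAG block contents or checksum is attempted.

```python
def split_tag_block(line):
    """Return (tag_block, sentence) for one received line.

    tag_block is the text between the two backslashes, or None if the
    line has no TAG block (or the block is unterminated).
    """
    if not line.startswith("\\"):
        return None, line
    end = line.find("\\", 1)
    if end == -1:
        # Opening backslash with no closing one: nothing usable here.
        return None, ""
    return line[1:end], line[end + 1:]


# The TAG block contents and its "*00" checksum are placeholders.
tag, sentence = split_tag_block("\\s:mux1*00\\$GPGLL*50\r\n")
```

An application that does not support TAG blocks would simply ignore the first element of the returned pair and process the sentence as usual.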

Transport and Network Layers

Network Layer

A well-structured post would work its way down the stack (or up) but let's get the network layer out of the way first.  It's IPv4.  With the exception of kplex I am not aware of any marine devices or applications which explicitly support NMEA-0183 over IPv6.  IPv6 may be implicitly supported on some platforms where the development framework takes care of the dirty networking details. It is not supported by OpenCPN.  To date out of hundreds of kplex users I have corresponded with, only one was using IPv6.  Do support IPv6 just because it's the decent thing to do.  Just don't expect anyone to use it.

Transport Layer

The majority of applications and devices expect data to be transmitted and received over either UDP broadcast or TCP.  For maximum interoperability both of these methods should be supported.  In a previous post I advocated UDP multicast as the optimum transport for NMEA-0183-style data.  OpenCPN supports this.  Kplex supports it.  Very little else seems to.  Please do implement it: it really isn't hard (the update to OpenCPN to support it was trivial) but as with IPv6, don't expect people to thank you for Doing The Right Thing.

UDP unicast is supported by some devices and applications.  It seems to be the preferred method for sending data to some AIS consolidation sites although most of these (including MarineTraffic) also seem to support TCP connections.

Packetization

Is that even a word?

TCP is a stream-based protocol so we simply write to it as we would a serial line and let TCP worry about dividing up the data (with a small caveat discussed later).  For UDP the question arises how we should divide sentences between datagrams.  Should we write one sentence per datagram or fill a datagram with sentences before sending?  This question is generally not relevant to interoperability.  Receivers I have examined simply read data from a socket without being concerned about packet boundaries.  The choice of how to send is usually one of expediency.  Sending one sentence per packet incurs a higher degree of network protocol overhead relative to the amount of data sent.  Buffering sentences to send several of them in a single datagram introduces delay.  The latter approach is also more difficult as it requires awareness of the maximum size of a datagram if fragmentation is to be avoided.  Assuming that a datagram can accommodate the 82 bytes of a single NMEA-0183 sentence (80 characters plus <CR><LF>) is not unreasonable.  The low data rates generally associated with NMEA-0183 mean that additional protocol overhead from one sentence per packet should not put undue load on a network.  As this is easier to code and involves less delay, this approach is my choice.

One exception to this is transmission of multi-sentence AIS data.  As sentences have to be reassembled, no additional delay is introduced by buffering parts of a multi-sentence AIS message.  Some AIS data consolidation sites such as localizatodo.com which do not support separate ports for each client rely on all parts of the message being transmitted in a single packet for the message to be correctly reassembled.  Few applications seem to concern themselves with buffering multi-sentence AIS messages to send in a single datagram.

In summary: for maximum compatibility ignore packet boundaries when reading.  With the exception mentioned above, how you packetize over UDP shouldn't affect compatibility but one sentence per packet would be my preferred choice.
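As an illustration of that sending policy, the sketch below yields one datagram payload per sentence but keeps the parts of a multi-sentence AIS message together. The function name `group_for_datagrams` is hypothetical. It assumes sentences arrive as complete strings including their <CR><LF>, that AIS sentences are identified by a "VDM" or "VDO" formatter, and that the first two fields after the tag are the fragment count and fragment number as described in publicly available AIVDM documentation.

```python
def group_for_datagrams(sentences):
    """Yield bytes payloads, one per UDP datagram to be sent."""
    pending = []
    for s in sentences:
        fields = s.split(",")
        is_ais = s[:1] == "!" and fields[0][3:] in ("VDM", "VDO")
        total = frag = 1
        if (is_ais and len(fields) > 2
                and fields[1].isdigit() and fields[2].isdigit()):
            total, frag = int(fields[1]), int(fields[2])
        if total > 1:
            if frag == 1:
                # Start of a new message: drop any incomplete earlier one.
                pending = []
            pending.append(s)
            if frag == total:
                yield "".join(pending).encode("ascii")
                pending = []
        else:
            # Ordinary sentence: send immediately, one per datagram.
            yield s.encode("ascii")
```

Each yielded payload would then be passed to a single `sendto()` call. A two-part AIS message is at most around 164 bytes, so fragmentation should not normally be a concern on typical networks.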

Port

10110 is the port the NMEA have registered with the IANA for data over both UDP and TCP.  10110 is used as the default port by OpenCPN but there is a wide range of ports used by other devices and applications so this should always be user configurable.  Kplex's approach is first to use a user-defined port.  If none is specified it looks for "nmea-0183" in the system services database and if not found falls back to a default of 10110.
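The lookup order described for kplex could be approximated in Python as below. This is a sketch of the idea rather than kplex's actual code, and `resolve_port` is a hypothetical name.

```python
import socket

DEFAULT_PORT = 10110  # port registered with the IANA for NMEA-0183


def resolve_port(user_port=None, proto="udp"):
    """Pick a port: user setting, then services database, then default."""
    if user_port is not None:
        return int(user_port)
    try:
        # Consults the system services database (e.g. /etc/services).
        return socket.getservbyname("nmea-0183", proto)
    except OSError:
        return DEFAULT_PORT
```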

Network Addresses

Broadcast

Some devices and applications seem to follow the often-frowned-upon practice of sending broadcast data to the broadcast address of the zero network, i.e. to 255.255.255.255.  Better practice is to use the sending system's subnet broadcast address.
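Given an interface address and prefix length (or netmask), the subnet broadcast address can be derived with Python's standard `ipaddress` module. The sketch below assumes the application already knows the address and mask of the interface it is sending from; discovering them is platform specific and not shown. The function names are hypothetical.

```python
import ipaddress
import socket


def subnet_broadcast(interface_cidr):
    """Return the broadcast address for e.g. '192.168.1.23/24'."""
    iface = ipaddress.ip_interface(interface_cidr)
    return str(iface.network.broadcast_address)


def open_broadcast_sender():
    """Create a UDP socket that is permitted to send to broadcast addresses."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock
```

A sender would then call something like `sock.sendto(payload, (subnet_broadcast("192.168.1.23/24"), 10110))`.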

Applications and devices receiving UDP are rarely coded to care about the address to which the data they receive was sent and, with the exception of kplex, every application I have seen simply binds a receiving UDP socket to INADDR_ANY.  kplex can of course be told to do things like listen on a particular network interface, which OpenCPN cannot.  To my knowledge, no-one has raised this as an issue for OpenCPN and in most environments end users simply won't care.
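A receiver following the common practice might be set up as in the sketch below. In Python an empty bind address means INADDR_ANY. The function name `open_udp_receiver` is hypothetical, and setting SO_REUSEADDR is my own assumption: it may allow more than one application on the same host to bind the port, but that behaviour is platform dependent.

```python
import socket


def open_udp_receiver(port=10110, bind_addr=""):
    """Bind a UDP socket; bind_addr="" listens on all interfaces."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Assumption: other NMEA consumers may want the same port on this host.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((bind_addr, port))
    return sock
```

Passing a specific local address as `bind_addr` restricts which destination addresses the socket accepts data for, which is the kind of control most applications do not offer.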

Multicast

Ignoring the proprietary IEC-61162-4 standard, there is no default or commonly agreed multicast group for NMEA-0183 over IP used by open source applications or devices aimed at recreational boaters.  This should therefore be end-user configurable. If picking a default I would choose one from the IPv4 organisation-local range (239.192.0.0/14 as described in RFC 2365) or the site-local IPv6 range (ffx5::/16 as described in RFC 4291).
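Joining an IPv4 multicast group on a receiving socket takes only a few lines, as the following sketch suggests. The helper names are hypothetical and any group address used with them would be an arbitrary choice for illustration, since no group is commonly agreed.

```python
import ipaddress
import socket
import struct

ORG_LOCAL_V4 = ipaddress.ip_network("239.192.0.0/14")  # RFC 2365


def is_org_local(group):
    """True if an IPv4 group lies in the organisation-local scope."""
    return ipaddress.ip_address(group) in ORG_LOCAL_V4


def join_multicast(sock, group, interface_addr="0.0.0.0"):
    """Ask the kernel to deliver traffic for `group` to this UDP socket.

    interface_addr "0.0.0.0" lets the system choose the interface.
    """
    mreq = struct.pack("4s4s",
                       socket.inet_aton(group),
                       socket.inet_aton(interface_addr))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
```

The socket passed to `join_multicast` would be an ordinary UDP socket already bound to the chosen port.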

TCP Considerations

Nagle Algorithm

If you accept the arguments given for sending one sentence per UDP packet, you'll also want to set TCP_NODELAY on sending TCP sockets.  kplex and OpenCPN do this.  Network analysis of some other devices suggests that they don't.  To disable the Nagle algorithm or not makes no difference to interoperability.
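Disabling the Nagle algorithm is a single socket option. A minimal Python sketch, with a hypothetical helper name:

```python
import socket


def disable_nagle(sock):
    """Send each write promptly instead of waiting to coalesce small writes."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


tcp_sock = disable_nagle(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
```

With this set, writing one complete sentence per `send()` call gives behaviour broadly comparable to one sentence per UDP datagram.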

Service Discovery

UDP broadcast does not require service discovery.  There is no subscribe/publish mechanism for a client to request NMEA-0183 data over unicast UDP from a server so there is no need for service discovery for UDP.

For TCP, finding a server for NMEA-0183 data on the network can be an issue in end users' boat networks where devices are configured with dynamic addresses.

Products intended for use under Navico's "GoFree" brand are the only ones I am aware of making use of service discovery for locating NMEA-0183 data sources.  GoFree uses two service discovery mechanisms.  One is Apple's Bonjour mechanism with services announced as "_nmea-0183._tcp". The other is Navico's own JSON-based service announcements sent to multicast group 239.2.1.1 port 2052 as detailed in their "tier 1" specification document.

Applications should not assume that any programs other than those specifically designed to work with GoFree support such service discovery.  OpenCPN does not (either as client or server). kplex can listen for Navico service announcements in order to locate a server, but does not advertise itself using them.

As Bonjour is the more generic and widely supported protocol, applications which do wish to leverage service discovery should probably opt for it, using the service "_nmea-0183._tcp".

Miscellaneous

What about baud rate?

The answer may be obvious but I have seen this question asked so it is worth addressing.  This isn't an issue.  Baud is simply the transmission rate used on a serial line. It is not a property of the data which is transmitted.  The slowest rates commonly used on IP networks today, wired or wireless, are many times faster than the fastest speeds at which NMEA data are commonly transmitted over serial lines. Generally speaking, programmers don't have to worry about transmission rates for this kind of data over IP.

Is ethernet different from wireless?

As far as the average application programmer is concerned?  No.  In some cases the physical medium over which some kinds of marine data are transmitted can matter, as discussed in the previous post.  At the kinds of data rates at which NMEA-0183 is transmitted the programmer should not need to care whether their application is running on a wired or wireless network.

Summary

For maximum interoperability with other applications, applications using the application layer of NMEA-0183 over IP should:
  • Use UDP broadcast and TCP over IPv4 as transports
  • Use ASCII strings as they would be sent over a serial line with no additional encapsulation, starting from an initial "!" or "$", ending with the sequence <CR><LF>.
  • Use NMEA-0183 checksums
  • Accept sentences terminated with <CR> or <LF> and ignore all subsequent data until the start of a new sentence
  • Observe a maximum sentence length of 80 characters excluding the line termination characters
  • Use a default but configurable port of 10110 for both UDP and TCP
  • Ensure that TAG blocks, if not supported, are gracefully ignored without discarding their associated sentences
But it would also be nice if applications:
  • Support UDP multicast
  • Support IPv6 as well as IPv4
  • Send data as soon as usable information is available for sending: One sentence or all sentences forming part of a multi-sentence message in one datagram in the case of UDP, Nagle algorithm disabled in the case of TCP
  • Use subnet broadcast addresses rather than 255.255.255.255 when sending IPv4 broadcast
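The sending-side items in the summary (start character, checksum, <CR><LF> termination) can be illustrated with a short Python sketch. It assumes the checksum is the XOR of every character between the start character and the "*", written as two hexadecimal digits, as described in "NMEA Revealed". The function names are hypothetical.

```python
from functools import reduce


def nmea_checksum(body):
    """XOR of all characters between the start character and the '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def format_sentence(body, start="$"):
    """Wrap a sentence body with start character, checksum and <CR><LF>."""
    return f"{start}{body}*{nmea_checksum(body):02X}\r\n"


def has_valid_checksum(sentence):
    """Check a received sentence such as '$GPGLL,...*hh' against its checksum."""
    sentence = sentence.strip("\r\n")
    body, sep, given = sentence[1:].rpartition("*")
    if not sep or len(given) != 2:
        return False
    try:
        return int(given, 16) == nmea_checksum(body)
    except ValueError:
        return False
```

The resulting string is what would be written unchanged to a TCP socket or placed in a UDP datagram, exactly as it would be sent down a serial line.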