Disclaimer: These are planning documents. The functionalities described here may be unimplemented, partially implemented, or implemented differently than the original design.

Circuit and Service Identifiers as Rust structs

Summary

In Splinter’s Rust code (libsplinter, splinterd, etc.), use structs instead of strings to represent circuit and service identifiers.

Motivation

Throughout Splinter, circuit and service identifiers are passed as String or &str, offering no compiler-enforced guarantees as to whether the passed in String or &str is a well-formed identifier.

Since the set of valid identifiers is substantially smaller than the set of valid strings, the functions which accept strings should be validating the passed-in arguments for correctness and throwing an InvalidArgumentError if the argument is not well-formed. In part because of the extent to which the identifiers are passed around, the current function implementations do not consistently do this check. Instead, the functions often assume that the caller will only pass in valid strings. As a result, the contract between caller and the functions has implied rules which could easily be violated at runtime, and without explicit checks for invalid arguments, the behavior of the functions when invalid strings are provided is potentially undefined (in that it is not supported by the functions explicitly). By using structs instead of using strings, the compiler can enforce the function arguments as correct and the opportunity for these types of runtime errors is removed completely.

A secondary issue with passing strings instead of structs occurs when the strings must be parsed, potentially resulting in parsing errors. In functions which parse the strings, a side-effect is the partial validation of the string which potentially results in an error (though currently the return of InvalidArgumentError is not consistent, because this error was introduced after much of the code was written). The parsing is often ad hoc. By using structs, the opportunity for these runtime errors is removed from all code except for code which calls the struct’s constructors. By creating the structs early and using them throughout the rest of the code, the opportunity for runtime parsing errors in low-level functions is removed. Parsing the string multiple times is also avoided.

Guide-level Explanation

Circuit Identifiers

A valid circuit identifier is a string of length 11 with the following structure:

  • characters 0-4 are alphanumeric
  • character 5 is a -
  • characters 6-10 are alphanumeric

A circuit identifier string may be converted to an integer by removing the - character and then using base62 conversion. Likewise, the reverse operation can be performed to convert an integer to a circuit identifier string.

The circuit identifier 00000-00000 is reserved for use as the management (admin) circuit.

To represent a circuit identifier, we add a struct called CircuitId.

Service Identifiers

A service id consists of a string, with one of the following formats:

  • 4 character alphanumeric string (non-management circuits)
  • a public key hex string (management circuit only)
  • a node identifier (management circuit only)

Service ids can be converted to an integer and back using base62 encoding.

To represent a service identifier, we add a struct called ServiceId.

Fully Qualified Service Identifiers

It is common to combine circuit and service identifiers into a single string, of the format <circuit_id>::<service_id>. This is called a fully-qualified service identifier and is supported with the struct FullyQualifiedServiceId.

FullyQualifiedServiceId enforces valid combinations only; for example, allowing public key hex string service identifiers on the management circuit, but not non-management circuits.

Examples

The following are example valid circuit ids:

  • 00000-00000
  • ABCDE-01234
  • foA8k-03kAM

The following are examples of valid service ids on non-management circuits:

  • 00aa
  • 45R3
  • Amrk

The following are examples of valid service ids on the management circuit:

  • node1
  • 02342b593af807a10e202c878253f69101c5d8e51ef6304acd741c54c3fa6011a3
  • 03f8288acfa95e6f35c58ca9b7dc133e095157d8c99703c0c0355a968f2ace1a42

The following are examples of valid fully-qualified service ids:

  • ABCDE-01234::00aa
  • foA8k-03kAM::45R3
  • 00000-00000::node1
  • 00000-00000::02342b593af807a10e202c878253f69101c5d8e51ef6304acd741c54c3fa6011a3
  • 00000-00000::03f8288acfa95e6f35c58ca9b7dc133e095157d8c99703c0c0355a968f2ace1a42

Reference-level Explanation

The following structs become part of libsplinter’s public API:

  • splinter::service::CircuitId
  • splinter::service::ServiceId
  • splinter::service::FullyQualfiedServiceId

When accepting a string, Into<String> will be used to support a wide array of arguments without requiring explicit string conversion by the caller. Explicitly, it is desirable to support construction directly from the following types:

  • &str
  • Box<str>
  • String

Any invalid string provided to a constructor will result in an InvalidArgumentError.

All structs derive the following traits: Clone, Debug, Hash, PartialEq, Eq.

CircuitId

The CircuitId struct will contain the following public functions in its implementation:

impl CircuitId {
    pub fn new<T: Into<String>>(circuit_id: T) -> Result<Self, InvalidArgumentError> { ... }
    pub fn new_random() -> Self { ... }
    pub fn as_str&self) -> &str { ... }
    pub fn deconstruct(self) -> Box<str> { ... }
}

The following additional traits will be implemented for CircuitId:

impl TryFrom<String> for CircuitId { ... }
impl TryFrom<Box<str>> for CircuitId { ... }
impl TryFrom<&str> for CircuitId { ... }
impl std::fmt::Display for CircuitId { ... }

The combination of deconstruct() and TryFrom<Box<str>> provides a method of deconstruction and reconstruction without incurring any additional allocation.

ServiceId

The ServiceId struct will contain the following public functions in its implementation:

impl ServiceId {
    pub fn new<T: Into<String>>(service_id: T) -> Result<Self, InvalidArgumentError> { ... }
    pub fn new_random() -> Self { ... }
    pub fn identity(&self) -> &ServiceIdentity { ... }
    pub fn as_str&self) -> &str { ... }
    pub fn deconstruct(self) -> (Box<str>, ServiceIdentity) { ... }
}

The following additional traits will be implemented for ServiceId:

impl TryFrom<String> for ServiceId { ... }
impl TryFrom<Box<str>> for ServiceId { ... }
impl TryFrom<(Box<str>, ServiceIdentity)> for ServiceId { ... }
impl TryFrom<&str> for ServiceId { ... }
impl std::fmt::Display for ServiceId { ... }

The combination of deconstruct() and TryFrom<(Box<str>, ServiceIdentity)> provides a method of deconstruction and reconstruction without incurring any additional allocation.

In order to support returning the identity information packed within a service id, the following enum is defined:

pub enum ServiceIdentity {
    Normal(String),
    NodeId(String),
    PublicKey(PublicKey),
}

The PublicKey struct used is cylinder::PublicKey and is created using PublicKey::new_from_hex(...).

FullyQualifiedServiceId

The FullyQualifiedServiceId struct will contain the following public functions in its implementation:

impl FullyQualifiedServiceId {
    pub fn new(circuit_id: CircuitId, service_id: ServiceId) -> Self { ... }
    pub fn new_from_string<T: AsRef<str>>(fully_qualified_service_id: T) -> Result<Self, InvalidArgumentError> { ... }
    pub fn new_random() -> Self { ... }
    pub fn circuit_id(&self) -> &CircuitId { ... }
    pub fn service_id(&self) -> &ServiceId { ... }
    pub fn deconstruct(self) -> (CircuitId, ServiceId) { ... }
}

The new_random() function will create a normal non-management identifier.

Drawbacks

Integrating this concept into the existing codebase is a complex undertaking due to the extent of code which passes circuit and service identifiers around.

A node identifier is currently any valid UTF-8 string; this design proposes restricting it to a base 62 string. The namespace for a public key hex string and a node identifier overlap and a collision is possible, though it could be considered a configuration error. Further restrictions on node identifier could possibly resolve this issue.

The management circuit is currently designated by the string admin, not by the string 00000-00000. Thus, conversion between the two will be a necessity for backward compatibility.

The service identifiers used on the management circuit today are not 1:1 with the ones defined in this design. Today, fully-qualified service identifiers can have the format:

  • admin::<node_id>
  • admin::public_key::<remote_public_key>::public_key::<local_public_key>

The first is the same (with admin replaced with 00000-00000 as noted above), but the second one has two public keys. This form of service identifier captures both local and remote public keys, which is used when determining the correct peer connection; in this design, however, only the one public key needed to refer to the service is present. A requirement exists to be able to determine the proper PeerTokenPair in order to find the proper peer connection. While today that can be derived from a single service id, with this new format, it will require both the sender and destination service identifiers in order to find the correct PeerTokenPair. Additional design work will be necessary to figure out the best way to handle the implementation of this change.

The API for ServiceIdentity uses cylinder::PublicKey, which is an external crate.

Rationale and Alternatives

The ServiceIdentifier struct exposes admin service design/functionality outside of the admin service itself. This is an intentional decision to consider the definition of both circuit identifiers and service identifiers in their entirety. The primary motivator, however, is that it moves the creation of the PublicKey to the ServiceId’s constructor, thus forcing runtime errors in hex conversion to happen earlier in the process and requiring less error handling overall.