Unmarshalling

“Unmarshalling” is the process of converting some kind of lower-level representation, often a “wire format”, into a higher-level (object) structure. Other popular names for it are “Deserialization” and “Unpickling”.

In spray “Unmarshalling” means the conversion of an HttpEntity, the model class for the entity body of an HTTP request or response (depending on whether used on the client or server side), into an object of type T.

Unmarshalling for instances of type T is performed by an Unmarshaller[T], which is defined like this:

type Unmarshaller[T] = Deserializer[HttpEntity, T]
trait Deserializer[A, B] extends (A => Deserialized[B])
type Deserialized[T] = Either[DeserializationError, T]

So, an Unmarshaller is basically a function HttpEntity => Either[DeserializationError, T]. Compared to their counterparts, the Marshallers, Unmarshallers are somewhat simpler, since they are plain functions and do not have to deal with chunk streams (which are currently not supported in unmarshalling) or delayed execution.
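To make the "just a function" point concrete, here is a minimal, self-contained sketch that mirrors the three definitions above without depending on spray. Entity, the two error types and intUnmarshaller are simplified stand-ins invented for this illustration, not spray's actual classes.

```scala
object UnmarshallerModel {
  // Hypothetical stand-ins for spray's HttpEntity and DeserializationError.
  final case class Entity(contentType: String, body: String)

  sealed trait DeserializationError
  case object ContentExpected extends DeserializationError
  final case class MalformedContent(message: String) extends DeserializationError

  // Same shape as the three definitions above.
  type Deserialized[T] = Either[DeserializationError, T]
  trait Deserializer[A, B] extends (A => Deserialized[B])
  type Unmarshaller[T] = Deserializer[Entity, T]

  // An unmarshaller for Int: a plain, synchronous function from entity to Either.
  val intUnmarshaller: Unmarshaller[Int] = new Deserializer[Entity, Int] {
    def apply(entity: Entity): Deserialized[Int] =
      if (entity.body.isEmpty) Left(ContentExpected)
      else {
        try Right(entity.body.trim.toInt)
        catch {
          case e: NumberFormatException => Left(MalformedContent(e.getMessage))
        }
      }
  }
}
```

Calling UnmarshallerModel.intUnmarshaller on an entity either yields a Right with the parsed value or a Left describing why the entity could not be converted. Nothing is deferred or streamed.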

Default Unmarshallers

spray-httpx comes with pre-defined Unmarshallers for the following types:

  • Array[Byte]
  • Array[Char]
  • String
  • NodeSeq
  • Option[T]
  • spray.http.FormData
  • spray.http.HttpForm
  • spray.http.MultipartContent
  • spray.http.MultipartFormData

Implicit Resolution

Since the unmarshalling infrastructure uses a type class based approach, Unmarshaller instances for a type T have to be available implicitly. The implicits for all the default Unmarshallers defined by spray-httpx are provided through the companion object of the Deserializer trait (since Unmarshaller[T] is just an alias for Deserializer[HttpEntity, T]). This means that they are always available and never need to be explicitly imported. Additionally, you can simply “override” them by bringing your own custom version into local scope.
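The following sketch illustrates the Scala rule this relies on, using a hypothetical Decoder type class instead of spray's Deserializer: an instance defined in the companion object is found without an import, and an implicit brought into local scope wins over it. All names here are made up for the illustration.

```scala
object ImplicitResolutionSketch {
  // Hypothetical type class standing in for Deserializer.
  trait Decoder[T] { def decode(raw: String): T }

  object Decoder {
    // Default instance in the companion object: found without any import.
    implicit val defaultString: Decoder[String] = new Decoder[String] {
      def decode(raw: String): String = raw
    }
  }

  def decodeAs[T](raw: String)(implicit decoder: Decoder[T]): T = decoder.decode(raw)

  // No implicit in local scope, so the companion-provided default is used.
  val viaDefault: String = decodeAs[String](" hi ")

  // A local implicit is found first and therefore "overrides" the default.
  val viaOverride: String = {
    implicit val trimming: Decoder[String] = new Decoder[String] {
      def decode(raw: String): String = raw.trim
    }
    decodeAs[String](" hi ")
  }
}
```

Here viaDefault keeps the surrounding whitespace while viaOverride is trimmed, because the compiler only falls back to the companion object when no suitable implicit is available in local scope.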

Custom Unmarshallers

spray-httpx gives you a few convenience tools for constructing Unmarshallers for your own types. One is the Unmarshaller.apply helper, which is defined as follows:

def apply[T](unmarshalFrom: ContentTypeRange*)
            (f: PartialFunction[HttpEntity, T]): Unmarshaller[T]

The default NodeSeqUnmarshaller for example is defined with it:

implicit val NodeSeqUnmarshaller =
  Unmarshaller[NodeSeq](`text/xml`, `application/xml`, `text/html`, `application/xhtml+xml`) {
    case HttpEntity.NonEmpty(contentType, data) =>
      XML.withSAXParser(createSAXParser())
        .load(new InputStreamReader(new ByteArrayInputStream(data.toByteArray), contentType.charset.nioCharset))
    case HttpEntity.Empty => NodeSeq.Empty
  }

As another example, here is an Unmarshaller definition for a custom type Person:

import spray.httpx.unmarshalling._
import spray.util._
import spray.http._

val `application/vnd.acme.person` =
  MediaTypes.register(MediaType.custom("application/vnd.acme.person"))

case class Person(name: String, firstName: String, age: Int)

object Person {
  implicit val PersonUnmarshaller =
    Unmarshaller[Person](`application/vnd.acme.person`) {
      case HttpEntity.NonEmpty(contentType, data) =>
        // unmarshal from the string format used in the marshaller example
        val Array(_, name, first, age) =
          data.asString.split(":,".toCharArray).map(_.trim)
        Person(name, first, age.toInt)

      // if we had meaningful semantics for the HttpEntity.Empty
      // we could add a case for the HttpEntity.Empty:
      // case HttpEntity.Empty => ...
    }
}

val body = HttpEntity(`application/vnd.acme.person`, "Person: Bob, Parr, 32")
body.as[Person] === Right(Person("Bob", "Parr", 32))

As this example shows, it is best to define the Unmarshaller for T in the companion object of T. This way your unmarshaller is always in scope, without any import tax.
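If you are wondering what a builder with the shape of Unmarshaller.apply has to do with its two argument lists, the following self-contained sketch shows one plausible reading. It does not use spray: Entity, the error types and the unmarshaller function are simplified stand-ins, and the behaviour (reject unlisted content types, report missing content when the partial function has no matching case, turn exceptions into an error value) is an assumption made for illustration rather than a copy of spray's implementation.

```scala
object ApplySketch {
  import scala.util.control.NonFatal

  // Hypothetical, simplified stand-ins for spray's entity and error types.
  sealed trait Entity
  case object Empty extends Entity
  final case class NonEmpty(contentType: String, body: String) extends Entity

  sealed trait DeserializationError
  case object ContentExpected extends DeserializationError
  final case class UnsupportedContentType(message: String) extends DeserializationError
  final case class MalformedContent(message: String) extends DeserializationError

  type Unmarshaller[T] = Entity => Either[DeserializationError, T]

  // Assumed behaviour of an apply-like builder; not spray's actual code.
  def unmarshaller[T](accepted: String*)(f: PartialFunction[Entity, T]): Unmarshaller[T] =
    (entity: Entity) => entity match {
      case NonEmpty(contentType, _) if !accepted.contains(contentType) =>
        Left(UnsupportedContentType("Expected one of: " + accepted.mkString(", ")))
      case _ if !f.isDefinedAt(entity) =>
        Left(ContentExpected) // f has no case for this entity, e.g. Empty
      case _ =>
        try Right(f(entity))
        catch { case NonFatal(e) => Left(MalformedContent(e.toString)) }
    }

  final case class Person(name: String, firstName: String, age: Int)

  val personUnmarshaller: Unmarshaller[Person] =
    unmarshaller[Person]("application/vnd.acme.person") {
      case NonEmpty(_, body) =>
        body.split(":,".toCharArray).map(_.trim) match {
          case Array(_, name, first, age) => Person(name, first, age.toInt)
        }
    }
}
```

With this model, leaving out the case for the empty entity is what makes the resulting unmarshaller reject empty input, which matches the commented-out case in the Person example above.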

Deriving Unmarshallers

Unmarshaller.delegate

Sometimes you can save yourself some work by reusing existing Unmarshallers for your custom ones. The idea is to “wrap” an existing Unmarshaller with some logic to “re-target” it to your type.

In this regard “wrapping” an Unmarshaller can mean one or both of the following two things:

  • Transform the input HttpEntity before it reaches the wrapped Unmarshaller
  • Transform the output of the wrapped Unmarshaller

You can do both, but the existing support infrastructure favors the latter over the former. The Unmarshaller.delegate helper allows you to turn an Unmarshaller[A] into an Unmarshaller[B] by providing a function A => B:

def delegate[A, B](unmarshalFrom: ContentTypeRange*)
                  (f: A => B)
                  (implicit mb: Unmarshaller[A]): Unmarshaller[B]

For example, by using Unmarshaller.delegate the Unmarshaller[Person] from the example above could be simplified to this:

implicit val SimplerPersonUnmarshaller =
  Unmarshaller.delegate[String, Person](`application/vnd.acme.person`) { string =>
    val Array(_, name, first, age) = string.split(":,".toCharArray).map(_.trim)
    Person(name, first, age.toInt)
  }
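Conceptually, a delegate-style builder runs the wrapped unmarshaller first and then applies your function to a successful result. The sketch below models this idea in plain Scala. The Unmarshaller trait, Entity, the error types and the content-type check are simplified assumptions for illustration and not spray's implementation.

```scala
object DelegateSketch {
  import scala.util.control.NonFatal

  // Hypothetical, simplified stand-ins for spray's types.
  final case class Entity(contentType: String, body: String)

  sealed trait DeserializationError
  final case class UnsupportedContentType(message: String) extends DeserializationError
  final case class MalformedContent(message: String) extends DeserializationError

  trait Unmarshaller[T] {
    def apply(entity: Entity): Either[DeserializationError, T]
  }

  // Existing unmarshaller to build on: returns the entity body as a String.
  implicit val stringUnmarshaller: Unmarshaller[String] = new Unmarshaller[String] {
    def apply(entity: Entity): Either[DeserializationError, String] = Right(entity.body)
  }

  // Turns an Unmarshaller[A] into an Unmarshaller[B] by transforming its output.
  def delegate[A, B](accepted: String*)(f: A => B)(implicit ua: Unmarshaller[A]): Unmarshaller[B] =
    new Unmarshaller[B] {
      def apply(entity: Entity): Either[DeserializationError, B] =
        if (!accepted.contains(entity.contentType)) {
          Left(UnsupportedContentType("Expected one of: " + accepted.mkString(", ")))
        } else {
          ua(entity) match {
            case Left(error) => Left(error) // errors of the wrapped unmarshaller pass through
            case Right(a) =>
              try Right(f(a))
              catch { case NonFatal(e) => Left(MalformedContent(e.toString)) }
          }
        }
    }

  final case class Person(name: String, firstName: String, age: Int)

  val personUnmarshaller: Unmarshaller[Person] =
    delegate[String, Person]("application/vnd.acme.person") { string =>
      string.split(":,".toCharArray).map(_.trim) match {
        case Array(_, name, first, age) => Person(name, first, age.toInt)
      }
    }
}
```

The function passed to delegate never sees the entity itself, only the value produced by the wrapped unmarshaller, which is why this style favors transforming the output over transforming the input.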

Unmarshaller.forNonEmpty

In addition to Unmarshaller.delegate there is also another “deriving Unmarshaller builder” called Unmarshaller.forNonEmpty. It “modifies” an existing Unmarshaller to not accept empty entities.

For example, the default NodeSeqUnmarshaller (see above) accepts empty entities as a valid representation of NodeSeq.Empty. It might be, however, that in your application context empty entities are not allowed. To achieve this, instead of “overriding” the existing NodeSeqUnmarshaller with an all-custom re-implementation, you could do this:

implicit val myNodeSeqUnmarshaller = Unmarshaller.forNonEmpty[NodeSeq]

HttpEntity(MediaTypes.`text/xml`, "<xml>yeah</xml>").as[NodeSeq] === Right(<xml>yeah</xml>)
HttpEntity.Empty.as[NodeSeq] === Left(ContentExpected)
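The effect of such a wrapper can be modeled in a few lines. In this sketch the entity and error types are simplified stand-ins, and forNonEmpty takes the underlying unmarshaller as an explicit argument for clarity (the spray helper shown above evidently picks it up implicitly), so treat it as an illustration of the idea rather than the library's code.

```scala
object ForNonEmptySketch {
  // Hypothetical, simplified stand-ins for spray's types.
  sealed trait Entity
  case object Empty extends Entity
  final case class NonEmpty(body: String) extends Entity

  sealed trait DeserializationError
  case object ContentExpected extends DeserializationError

  trait Unmarshaller[T] {
    def apply(entity: Entity): Either[DeserializationError, T]
  }

  // A lenient unmarshaller that maps an empty entity to the empty string.
  val lenientString: Unmarshaller[String] = new Unmarshaller[String] {
    def apply(entity: Entity): Either[DeserializationError, String] = entity match {
      case Empty          => Right("")
      case NonEmpty(body) => Right(body)
    }
  }

  // Rejects empty entities up front, otherwise defers to the wrapped unmarshaller.
  def forNonEmpty[T](underlying: Unmarshaller[T]): Unmarshaller[T] = new Unmarshaller[T] {
    def apply(entity: Entity): Either[DeserializationError, T] = entity match {
      case Empty => Left(ContentExpected)
      case other => underlying(other)
    }
  }

  val strictString: Unmarshaller[String] = forNonEmpty(lenientString)
}
```

The wrapped unmarshaller is left untouched, so the lenient and the strict variant can coexist and you choose between them by deciding which one is implicitly in scope.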

More specific Unmarshallers

The plain Unmarshaller[T] is agnostic to whether it is used on the server- or on the client-side. This means that it can be used to deserialize the entities from requests as well as responses. Also, the only information that an Unmarshaller[T] has access to for its job is the message entity. Sometimes this is not enough.

FromMessageUnmarshaller

If you need access to the message headers during unmarshalling you can write a FromMessageUnmarshaller[T] for your type. It is defined as follows:

type FromMessageUnmarshaller[T] = Deserializer[HttpMessage, T]

and allows access to all members of the HttpMessage superclass of the HttpRequest and HttpResponse types, most importantly: the message headers. Since, like the plain Unmarshaller[T], it can deserialize requests as well as responses it can be used on the server- as well as the client-side.

An in-scope FromMessageUnmarshaller[T] takes precedence over any potentially available plain Unmarshaller[T].
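One way such a precedence can come about in Scala is sketched below with hypothetical EntityDecoder and MessageDecoder type classes. An entity-level decoder is lifted to a message-level one as a fallback, while a message-level decoder that is implicitly in local scope is picked first and gets to look at the headers. This illustrates the general mechanism only and makes no claim about how spray wires up its own implicits.

```scala
object PrecedenceSketch {
  // Hypothetical stand-in for an HTTP message: headers plus an entity body.
  final case class Message(headers: Map[String, String], body: String)

  trait EntityDecoder[T] { def decode(body: String): T }
  trait MessageDecoder[T] { def decode(message: Message): T }

  object MessageDecoder {
    // Fallback: any entity-level decoder can also serve at the message level.
    implicit def lift[T](implicit ed: EntityDecoder[T]): MessageDecoder[T] =
      new MessageDecoder[T] {
        def decode(message: Message): T = ed.decode(message.body)
      }
  }

  final case class Greeting(language: String, text: String)

  object Greeting {
    // Entity-level decoder: has no access to the headers.
    implicit val fromEntity: EntityDecoder[Greeting] = new EntityDecoder[Greeting] {
      def decode(body: String): Greeting = Greeting("unknown", body)
    }
  }

  def decodeMessage[T](message: Message)(implicit md: MessageDecoder[T]): T =
    md.decode(message)

  val message = Message(Map("Content-Language" -> "en"), "hello")

  // Only the entity-level decoder is available, so it is lifted.
  val viaEntityDecoder: Greeting = decodeMessage[Greeting](message)

  // A message-level decoder in local scope is found first and can read headers.
  val viaMessageDecoder: Greeting = {
    implicit val fromMessage: MessageDecoder[Greeting] = new MessageDecoder[Greeting] {
      def decode(m: Message): Greeting =
        Greeting(m.headers.getOrElse("Content-Language", "unknown"), m.body)
    }
    decodeMessage[Greeting](message)
  }
}
```

In this model viaEntityDecoder ends up with the language "unknown", whereas viaMessageDecoder picks up "en" from the header, even though the entity-level decoder is still available in both cases.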

FromRequestUnmarshaller

The FromRequestUnmarshaller[T] is the most “powerful” unmarshaller that can be used on the server-side (and only there). It is defined like this:

type FromRequestUnmarshaller[T] = Deserializer[HttpRequest, T]

and allows access to all members of the incoming HttpRequest instance.

An in-scope FromRequestUnmarshaller[T] takes precedence over any potentially available FromMessageUnmarshaller[T] or plain Unmarshaller[T].

FromResponseUnmarshaller

The FromResponseUnmarshaller[T] is the most “powerful” unmarshaller that can be used on the client-side (and only there). It is defined like this:

type FromResponseUnmarshaller[T] = Deserializer[HttpResponse, T]

and allows access to all members of the incoming HttpResponse instance.

An in-scope FromResponseUnmarshaller[T] takes precedence over any potentially available FromMessageUnmarshaller[T] or plain Unmarshaller[T].