Crypto/Processing Streams

Ref: cryptoface

In the process of designing a cryptography wrapper, I've run into a tricky set of trade-offs in how libraries handle processing data streams. From my findings, there are at least three different ways in which data can be managed, described in the sections below.

In designing a wrapper that must accommodate implementations written in any of these styles, it's a tricky balancing act to figure out the best approach. One must weigh performance against the importance of a clean but powerful API.

Chunk Processing

An example of basic chunk processing is that provided by mhash: you feed it chunks of data until the end, at which point you get your result out. This is very flexible; however, it is quite a bit harder to implement when there are multiple transformations and/or when the number of input chunks does not necessarily match the number of output chunks.
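As a concrete sketch of the chunk style, here is roughly what a SHA-1 digest looks like with mhash's init/update/end calls (going from its documented C API; exact names and signatures may differ across versions):

    #include <stdio.h>
    #include <stdlib.h>
    #include <mhash.h>

    int main(void)
    {
        const char *msg = "hello world";

        MHASH td = mhash_init(MHASH_SHA1);   /* begin a hashing context */
        if (td == MHASH_FAILED)
            return 1;

        /* feed data in arbitrary-sized chunks; a loop over a file
         * works the same way */
        mhash(td, msg, 11);

        /* only at the end does the result become available */
        unsigned char *digest = mhash_end(td);
        for (unsigned i = 0; i < mhash_get_block_size(MHASH_SHA1); i++)
            printf("%.2x", digest[i]);
        printf("\n");

        free(digest);
        return 0;
    }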

Callbacks

An example of callback handling is MS CAPI's CryptMsgOpenToDecode. The general workflow is that you set up a state machine and feed it data; whenever output becomes available, the state machine calls your callback function with as much data as it can provide. You are then responsible for copying the data out and putting it wherever it needs to go. This is a powerful option; however, it makes rewrapping as chunk processing a challenge.
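In the abstract, such an interface looks something like the following (a hypothetical sketch mirroring the CryptMsgUpdate pattern; none of these names come from a real library):

    #include <stddef.h>

    /* A push-style decoder: the caller supplies the sink, and the
     * state machine decides when to invoke it. Names are illustrative. */
    typedef void (*output_cb)(void *userdata,
                              const unsigned char *out, size_t len);

    typedef struct decoder decoder;   /* opaque state machine */

    /* register the output callback up front */
    decoder *decoder_open(output_cb cb, void *userdata);

    /* feed input; the callback may fire zero or more times per call */
    int decoder_update(decoder *d,
                       const unsigned char *in, size_t len, int final);

    void decoder_close(decoder *d);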

Callbacks can readily wrap chunk processing at the cost of an extra location to store the output buffer (although this is most likely how native callback systems work internally anyway).
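For example, a hash exposed through a chunk API can be put behind that callback interface with a thin adapter; since a hash produces no intermediate output, the callback fires exactly once, at finalization (again a hypothetical sketch built on mhash; the adapter names are made up):

    #include <stddef.h>
    #include <stdlib.h>
    #include <mhash.h>

    typedef void (*output_cb)(void *userdata,
                              const unsigned char *out, size_t len);

    /* illustrative adapter type, not part of any library */
    typedef struct {
        MHASH     td;
        hashid    algo;
        output_cb cb;
        void     *userdata;
    } hash_cb_ctx;

    hash_cb_ctx *hash_cb_open(hashid algo, output_cb cb, void *userdata)
    {
        hash_cb_ctx *c = malloc(sizeof *c);
        if (!c)
            return NULL;
        c->td = mhash_init(algo);
        if (c->td == MHASH_FAILED) {
            free(c);
            return NULL;
        }
        c->algo = algo;
        c->cb = cb;
        c->userdata = userdata;
        return c;
    }

    /* input chunks pass straight through to the chunk processor */
    void hash_cb_update(hash_cb_ctx *c, const void *in, mutils_word32 len)
    {
        mhash(c->td, in, len);
    }

    /* the one place output exists: hand it to the callback, clean up */
    void hash_cb_final(hash_cb_ctx *c)
    {
        unsigned char *digest = mhash_end(c->td);
        c->cb(c->userdata, digest, mhash_get_block_size(c->algo));
        free(digest);
        free(c);
    }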

Stream Filter Abstractions

Examples of stream abstractions in use are OpenSSL's PKCS7 signing and Crypto++'s HashFilter. In reality, these are just complex wrappers around a callback system... however, they can provide a clean conceptual model and unify the handling of both input and output.

Stream filters can readily wrap chunk processing in the same way that callbacks can... just take the input data from the chunk and pipe it forward.
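A toy version of the idea, with an uppercasing transform standing in for a real cipher (hypothetical sketch; the filter struct and stage names are invented for illustration):

    #include <ctype.h>
    #include <stddef.h>
    #include <stdio.h>

    /* each stage consumes a chunk and pushes its output downstream */
    typedef struct filter filter;
    struct filter {
        void  (*write)(filter *self, const unsigned char *in, size_t len);
        filter *next;
    };

    /* terminal stage: dump everything to stdout */
    void sink_write(filter *self, const unsigned char *in, size_t len)
    {
        (void)self;
        fwrite(in, 1, len, stdout);
    }

    /* a chunk transform wrapped as a filter: transform each chunk,
     * then pipe the result forward to the next stage */
    void upcase_write(filter *self, const unsigned char *in, size_t len)
    {
        unsigned char buf[64];
        while (len > 0) {
            size_t n = len < sizeof buf ? len : sizeof buf;
            for (size_t i = 0; i < n; i++)
                buf[i] = (unsigned char)toupper(in[i]);
            self->next->write(self->next, buf, n);
            in += n;
            len -= n;
        }
    }

    int main(void)
    {
        filter sink   = { sink_write, NULL };
        filter upcase = { upcase_write, &sink };
        upcase.write(&upcase, (const unsigned char *)"stream filters", 14);
        putchar('\n');
        return 0;
    }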

Resolution

Since stream filters can usually be treated just like callbacks, the design can consider the two equivalent for the purpose of wrapping. To handle the differing I/O mechanisms across libraries as efficiently as possible, providing both APIs may be the best option. The underlying wrapper interface can take advantage of whichever interface is native to a given library, while the non-native API is backed by a smart common codebase that manages the differences.

This "smart common codebase" would be code managing buffers/streaming/etc in order to deal with the situation where too much data is available for the chunk processing, but the callback/filter has already used up the data in providing the additional data.

If there are any other paradigms for filtering/processing data, please let me know, either by commenting or by emailing me. I'll post an update when possible with more information, as I'm certain there'll be many interested in a one-lib-to-crypt-them-all that makes dealing with cryptography less complicated... since the library will take care of the nasty details, distilling them into a single unified interface.
