Introduction

Welcome to bem. We're building the next generation of data transformation primitives so you don't have to. In this reference, you'll find a comprehensive list of all available endpoints, with their parameters and responses. Please give us feedback; that's how we build an amazing product.

API Base URL

Unless otherwise specified, all endpoints use https://api.bem.ai as their base URL.

API Keys

For all requests, you'll need an API key. Pass this in using an x-api-key header.
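
As a minimal sketch of the two conventions above, the snippet below builds (but does not send) a request against the base URL with the x-api-key header attached. The path /hypothetical-endpoint and the key value are placeholders chosen for illustration, not real bem endpoints or credentials.

```python
import urllib.request

BASE_URL = "https://api.bem.ai"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a request object that carries the API key in the x-api-key header."""
    return urllib.request.Request(BASE_URL + path, headers={"x-api-key": api_key})


# "/hypothetical-endpoint" is a placeholder path, not a documented bem route.
req = build_request("/hypothetical-endpoint", "YOUR_API_KEY")
```

Passing req to urllib.request.urlopen would perform the call; any HTTP client works as long as the header is set on every request.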

Webhook Authentication

To confirm the authenticity of webhook requests coming from bem, we provide a bem-signature header on every outgoing request to the endpoint specified in your pipeline. The header value includes a timestamp (t=) and a signature (v1=), separated by a comma. The scheme is versioned so that it can be updated in the future.

bem-signature:
t=1492774577,
v1=0734be64d748aa8e8ee9dfe87407665541f2c33f9b0ebf19dfd0dd80f08f504c

Signatures are generated using HMAC with SHA-256. The webhook secret for your account can be generated, retrieved, and revoked through our API, and we use that secret to encode the payload into the signature we present in the header.

To verify the signature, you must complete the following steps:

Step 1: Extract timestamp and signature from header

Split the raw string to grab the respective t timestamp and v1 signature values.
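
One way to do this split, shown as a sketch: the function name parse_bem_signature is hypothetical, and the code assumes the header holds only comma-separated key=value pairs, possibly with whitespace around them.

```python
from typing import Tuple


def parse_bem_signature(header_value: str) -> Tuple[str, str]:
    """Return (timestamp, signature) extracted from a bem-signature header value."""
    fields = {}
    for item in header_value.split(","):
        key, sep, value = item.strip().partition("=")
        if sep:  # ignore fragments that are not key=value pairs
            fields[key] = value
    if "t" not in fields or "v1" not in fields:
        raise ValueError("bem-signature header is missing t or v1")
    return fields["t"], fields["v1"]


timestamp, signature = parse_bem_signature(
    "t=1492774577,v1=0734be64d748aa8e8ee9dfe87407665541f2c33f9b0ebf19dfd0dd80f08f504c"
)
```

The timestamp is kept as a string because it is concatenated as a string when building the signed payload.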

Step 2: Prepare the signed payload string

The payload string is created by concatenating:

  • The timestamp (as a string)
  • The character .
  • The actual JSON payload (stringified request body)

Step 3: Determine the expected signature

Compute an HMAC with the SHA-256 hash function (the string output should be in hex). Use your account's webhook secret as the key, and the signed payload string as the message.

Step 4: Compare the signatures

Compare your computed signature with the signature provided in the header using a simple string equality check. If the signatures match, you've validated that the request to your webhook endpoint came from bem.
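
The four steps can be combined into one helper, sketched below. The function name verify_bem_signature is hypothetical. The sketch assumes the webhook secret is used as UTF-8 bytes and that the signed body is the raw request body exactly as received (re-serializing the JSON could change the bytes and break the match). It also swaps the plain equality check for hmac.compare_digest, which performs the same comparison in constant time.

```python
import hashlib
import hmac


def verify_bem_signature(header_value: str, raw_body: bytes, secret: str) -> bool:
    """Return True if the bem-signature header matches the raw request body."""
    # Step 1: extract the t and v1 values from the header.
    fields = dict(
        item.strip().split("=", 1) for item in header_value.split(",") if "=" in item
    )
    timestamp = fields.get("t")
    provided = fields.get("v1")
    if timestamp is None or provided is None:
        return False

    # Step 2: signed payload = timestamp + "." + request body.
    signed_payload = timestamp.encode("utf-8") + b"." + raw_body

    # Step 3: HMAC-SHA256 keyed with the webhook secret, hex encoded.
    expected = hmac.new(
        secret.encode("utf-8"), signed_payload, hashlib.sha256
    ).hexdigest()

    # Step 4: compare the expected and provided signatures.
    return hmac.compare_digest(expected.encode("utf-8"), provided.encode("utf-8"))
```

In a web framework, pass the unparsed request body bytes and the value of the bem-signature header, and reject the request when the function returns False.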

Building an Output Schema

For some best practices and tips around how to effectively shape your output schema, you can take a look at our guide here.

Email inputs

In addition to our Create Transformation endpoint below, every pipeline has an associated automatic @pipeline.bem.ai email address where you can forward emails. The email address input handles attachments the same way as emails sent through our API, meaning you can send CSV, XLSX, XLS, and PDF files to be processed along with the email body content. The referenceID we store for each email processed through the pipeline email address is the value of the Message-ID header included in the email.

Processing Collections

By default, our pipelines perform linear transformations over your inputs, meaning that one input data point results in a single output object according to your schema. If you'd like to process a collection of data points (so that a single input results in an array of outputs), set the independentDocumentProcessingEnabled boolean option on the pipeline and give it an output schema that defines a single object. Your pipeline will then treat each individual row as a discrete entity, and each output transformation will be a batched array of objects according to your output schema. Each associated transformation has an itemOffset field to help you map each object to its discrete row in your input data.

Order of operations

All jobs are asynchronous, so we don't guarantee that transformed data points are returned in the order you sent them. If you must keep track of the order, we recommend generating time-sortable KSUIDs internally and using them as the referenceID for future sorting.
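
For illustration, here is a small sketch of a KSUID generator following the public KSUID layout: a 4-byte timestamp counted from the KSUID epoch (1400000000), followed by 16 random bytes, base62-encoded to 27 characters. The function name new_ksuid is hypothetical; in production a maintained KSUID library is the safer choice.

```python
import os
import time
from typing import Optional

KSUID_EPOCH = 1_400_000_000  # KSUID timestamps count seconds from this offset
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def new_ksuid(timestamp: Optional[int] = None) -> str:
    """Generate a 27-character KSUID string that sorts by creation second."""
    if timestamp is None:
        timestamp = int(time.time())
    # 4 bytes of timestamp followed by 16 bytes of random payload.
    raw = (timestamp - KSUID_EPOCH).to_bytes(4, "big") + os.urandom(16)
    number = int.from_bytes(raw, "big")
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(BASE62[remainder])
    # Left-pad with "0" so every ID has the same width and sorts as a string.
    return "".join(reversed(chars)).rjust(27, "0")


reference_id = new_ksuid()
```

Because the timestamp has one-second resolution, IDs created within the same second sort in random order relative to each other. If finer ordering matters, keep a separate sequence number on your side.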

Pagination

Our pagination follows the same conventions as the Stripe API: it is cursor-based, and the startingAfter and endingBefore parameters let you page back and forth through results. Both parameters accept an existing object ID and return objects in chronological order. The startingAfter parameter returns objects listed after the given object, and the endingBefore parameter returns objects listed before it. The two parameters are mutually exclusive, so a request can include one or the other but not both. An optional limit parameter controls the page size; if it is omitted, the API defaults to a page size of 50.
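
A forward cursor loop over startingAfter might look like the sketch below. The helper iterate_all and the fetch_page callback are hypothetical: fetch_page stands in for whatever function issues the HTTP request with the given query parameters and returns the list of objects in the page. The name of the ID field ("id" by default here) is also an assumption, so check the response shape of the endpoint you are listing.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = List[Dict[str, Any]]


def iterate_all(
    fetch_page: Callable[[Dict[str, Any]], Page],
    limit: int = 50,
    id_field: str = "id",  # assumed name of the object ID field
) -> Iterator[Dict[str, Any]]:
    """Yield every object by following startingAfter cursors page by page."""
    params: Dict[str, Any] = {"limit": limit}
    while True:
        page = fetch_page(dict(params))
        if not page:
            return
        yield from page
        if len(page) < limit:
            return  # a short page means there is nothing left to fetch
        # Use the last object's ID as the cursor for the next page.
        params["startingAfter"] = page[-1][id_field]
```

Paging backwards works the same way with endingBefore and the first object of each page, as long as only one of the two cursors is sent per request.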