Pipelines

In Pipelines, you'll take the building blocks of your predictions (your outcomes, personas, and cohorts) and create a prediction pipeline that you can plug directly into your data warehouse, cloud bucket, or, via managed deployment, your favorite martech software (ESP, CRM, etc.). Once complete, your predictions are kept up to date and delivered automatically to the destination you chose.

Getting started

Inside Pipelines, you'll find a list of your current pipelines if you have any, with columns for:

  • Population: the population being targeted for the pipeline's predictions.
  • Payload: the outcomes, cohorts, and/or persona sets included in the pipeline.
  • Deployments: the target destination of the deployment.
  • Status: whether the pipeline is ready, queued, or errored.

Creating a pipeline

  1. Select new pipeline in the upper right of the Pipelines list view.

Screenshot of an empty Pipelines view

  2. Next, select your payload for this pipeline. Your payload is a combination of your outcome, your persona sets, and other customer cohorts, all of which are used to customize your predictions.
  • Outcome: the business outcome you want for this prediction deployment, configured in Outcomes.
  • Persona set: the personas applied to this deployment. Each individual record on the deploying end of this pipeline will be assigned a persona.
  • Cohort: specific groups of customers based on criteria such as events and attributes, used as membership indicators in the pipeline.
  • Prediction explanations: ticking this checkbox adds prediction score explanations to your deployment. See Understanding score explainability for more info.

📘Membership indicators

🚧️Eligibility restrictions

  3. Once your payload is selected, choose a population to include. This is the group of people you want to target with your predictions, such as Leads in the case of a traditional acquisition campaign.

  4. After you've targeted a population, optionally exclude a population. Any cohorts selected in this section will not be targeted with your predictions.

Screenshot of a new pipeline with a likely to buy outcome for BigQuery

  5. With your criteria selected, click save pipeline. A loading bar will appear with a message indicating that your pipeline is building. Pipelines generally take a few hours to build, and you'll receive an email when yours is ready for use.

  6. When the pipeline finishes building, it will be disabled. You'll need to add a deployment before you can enable it; see the next section for how to add deployments.
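
Conceptually, the include and exclude steps above behave like a set difference: everyone in the included population, minus anyone in an excluded cohort. Here's a minimal Python sketch of that idea. It is not Faraday's implementation, and the function name, cohort names, and person IDs are all made up for illustration.

```python
def resolve_population(included, excluded):
    """Return the set of person IDs a pipeline would target.

    `included` and `excluded` are lists of cohorts, where each cohort is a
    set of person IDs (hypothetical stand-ins for Faraday cohorts).
    """
    targeted = set().union(*included)  # everyone in any included cohort
    blocked = set().union(*excluded)   # everyone in any excluded cohort
    return targeted - blocked


# Invented example data: target leads, but skip existing customers.
leads = {"p1", "p2", "p3", "p4"}
existing_customers = {"p3", "p4", "p9"}

target = resolve_population([leads], [existing_customers])
print(sorted(target))  # ['p1', 'p2']
```

If no exclusion is selected, pass an empty list and the full included population is returned.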

🚧️Editing a pipeline

Adding deployments

Deployments are how Faraday users plug their pipelines, and therefore their predictions, back into their stack. Deployments can be configured to send to a Faraday-hosted CSV, as well as to data warehouses, cloud buckets, and your favorite ESPs, CRMs, etc. You'll find the deployment section within a pipeline, under the pipeline's definition.

To create a deployment:

  1. Click add in the appropriate selection under Deployment.

Screenshot of a new pipeline with no deployment

  2. After clicking add, a window will open to provide specific options that enable you to tailor the deployment to your liking.

Choosing your deployment's representation format

  1. Select your data format:

    • Hashed (default): Best for deploying audiences to ad platforms. Data is hashed, and not human-readable.
    • Referenced: Best for merging data back into your stack. Uses a reference key defined in your dataset's advanced options to identify unique rows. If a reference key is not defined, this option is unavailable to select.
    • Identified: Best for direct mail and canvassing campaigns. Data is unhashed and human-readable.
    • Aggregated: Best for geotargeted ad campaigns. Select this to see the number of people in each payload element (outcome, persona, cohort) within the area of the geographic type you select.

Screenshot of a new pipeline target

  2. Next, select whether you'd like machine friendly or human friendly column headers.

    • Machine friendly: Best for automated systems where consistent naming is relevant.
    • Human friendly: Best for convenient, easy-to-read interpretation. Human friendly column headers include the outcome name and prediction type, so each column is instantly recognizable. This can make your predictions easier to identify when deploying to ESPs, CRMs, etc., where you'll want to quickly see a contact's persona or propensity score on their contact card.
  3. Click next to move on to advanced settings.
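
To get a feel for what "hashed, and not human-readable" means in practice, here's a small Python sketch. This guide doesn't say which hashing scheme Faraday applies, so the sketch assumes SHA-256 over a trimmed, lowercased email, which is a common convention for ad platform audience uploads. The function name is hypothetical.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize an email (trim, lowercase) and return its SHA-256 hex digest.

    Assumption: SHA-256 over the normalized address. The scheme Faraday
    actually uses for hashed deployments is not described in this guide.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


digest = hash_email("  Jane.Doe@Example.com ")
print(len(digest))                                   # 64
print(digest == hash_email("jane.doe@example.com"))  # True
```

Normalizing before hashing matters because two spellings of the same address would otherwise produce unrelated digests and fail to match.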

Choosing your advanced settings & finalizing deployments

  1. Choose which, if any, of the advanced settings below you'd like to set for this deployment.
  2. Click save to finish the deployment.
  3. When your deployment is complete, your pipeline will still be disabled by default, but you're now able to test your deployment with the test deployment button. Clicking test deployment will output 100 rows of your pipeline to the URL in the deployment.
  4. To keep your pipeline up to date automatically on a daily basis and enable the full results of your pipeline, click the enabled toggle in the upper right. It will display green when the pipeline is enabled, and your full results can be retrieved from the URL listed in the deployment or via the download button.

Filter

Filter enables you to filter by the persona sets, outcomes, and cohort memberships you selected for your pipeline's payload. You can select specific personas to target in the deployment. For example, to filter by your Married Mary persona, select its persona set, choose the "equal to" operator, and select Married Mary; only Married Marys will then be included in the deployment. Filtering by an outcome allows you to target a range of rows by percentile or score, enabling you to focus on only the people that matter most to you.

  • Outcome percentile is a whole number between 1 and 100 (inclusive) and refers to the percentile of the outcome score distribution. The number of individuals in each percentile varies; as a rough estimate, the top 10 score percentiles correspond to the top 10% of the population. For example, entering greater than or equal to 81 would keep roughly the top 20% of the population scored.
  • Outcome probability refers to the estimated probability of the outcome and is a decimal from 0 to 1. Enter the score with a leading zero before the decimal point. For example, a score of .5 would be entered as 0.5, and reflects a 50% probability that an individual will achieve the outcome.
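
If you'd like to preview what a filter like this would keep, you can approximate it against a downloaded CSV. The Python sketch below is illustrative only: the sample rows are invented, the persona "Single Sam" is made up, the function is hypothetical, and your own column headers may differ depending on the header style and structure settings you chose.

```python
import csv
import io

# Invented sample of a deployment CSV. Column names follow the fdy_*
# pattern used in this guide; the values are made up.
SAMPLE_CSV = """\
row_id,fdy_persona_set_persona_name,fdy_outcome_propensity_percentile,fdy_outcome_propensity_probability
1,Married Mary,95,0.82
2,Married Mary,40,0.31
3,Single Sam,88,0.74
4,Single Sam,81,0.66
"""

PERCENTILE = "fdy_outcome_propensity_percentile"
PROBABILITY = "fdy_outcome_propensity_probability"
PERSONA = "fdy_persona_set_persona_name"


def filter_rows(rows, min_percentile=None, min_probability=None, persona=None):
    """Keep rows that pass every filter that is set (None means no filter)."""
    kept = []
    for row in rows:
        if min_percentile is not None and int(row[PERCENTILE]) < min_percentile:
            continue
        if min_probability is not None and float(row[PROBABILITY]) < min_probability:
            continue
        if persona is not None and row[PERSONA] != persona:
            continue
        kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Percentile greater than or equal to 81: roughly the top 20%.
top_20 = filter_rows(rows, min_percentile=81)
print([r["row_id"] for r in top_20])  # ['1', '3', '4']

# Combine with a persona filter ("equal to" Married Mary).
marys = filter_rows(rows, min_percentile=81, persona="Married Mary")
print([r["row_id"] for r in marys])  # ['1']
```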

📘Further reading: Faraday scoring

Limit

In Limit, you can choose to limit your results to a top count or a bottom count of rows.

  • Only the top/bottom (count) exports an exact number of rows.
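
As a rough mental model, a top or bottom limit works like sorting rows and keeping the first N. The Python sketch below assumes rows are ranked by outcome probability, which this guide doesn't confirm; the function, its parameters, and the sample values are hypothetical.

```python
def limit_rows(rows, count, direction="top", key="fdy_outcome_propensity_probability"):
    """Return `count` rows with the highest ("top") or lowest ("bottom") value.

    Assumption: ranking is by the score in `key`. How Faraday ranks rows
    for a limit is not described in this guide.
    """
    if direction not in ("top", "bottom"):
        raise ValueError("direction must be 'top' or 'bottom'")
    ordered = sorted(rows, key=lambda row: float(row[key]), reverse=(direction == "top"))
    return ordered[:count]


# Invented rows for illustration.
scored = [
    {"row_id": "1", "fdy_outcome_propensity_probability": "0.82"},
    {"row_id": "2", "fdy_outcome_propensity_probability": "0.31"},
    {"row_id": "3", "fdy_outcome_propensity_probability": "0.74"},
    {"row_id": "4", "fdy_outcome_propensity_probability": "0.66"},
]

print([r["row_id"] for r in limit_rows(scored, 2)])                      # ['1', '3']
print([r["row_id"] for r in limit_rows(scored, 1, direction="bottom")])  # ['2']
```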

📘Additional limit info

📘Large pipelines

Structure

Under structure, you can rename and reorder columns. Renaming them can make it even more convenient when importing your data into your activation platform. For ad platform deployments like LinkedIn, Facebook, and Google Ads, selecting the appropriate option in the dropdown enables you to organize the file in a way that's convenient for upload to that platform.

📘Column naming conventions

Connection-specific

In this last settings option, you'll see format for hosted CSV deployments, or settings specific to the connection if you're deploying back to your database.

📘Advanced settings

Understanding deployment columns

A deployment in Faraday will include various points of data about your customers. When creating a deployment, in the structure section, you can select pre-formatted outputs for various destinations like Facebook, LinkedIn, and Google Ads to save the time & effort of formatting it yourself. In hashed deployments, personally identifiable information (PII) will be replaced by a hash key.

Hashed, identified, and referenced deployments

| Column name | Definition | Additional info |
| --- | --- | --- |
| row_id | Faraday's internal key. | |
| person_first_name | First name of the individual. | |
| person_last_name | Last name of the individual. | |
| house_number_and_street | Physical address of the individual. | |
| city | City the individual resides in. | |
| state | State the individual resides in. | |
| postcode | Postcode/Zip code the individual resides in. | |
| email | Email address of the individual in Faraday's data. | |
| fdy_persona_set_persona_id | The ID of the persona set in which this individual's persona exists (not the persona itself). | |
| fdy_persona_set_persona_name | The name of the persona that the individual belongs to. | |
| fdy_outcome_propensity_score | Absolute score (scale of 0.0-1.0) of the individual based on the model used. | Propensity scores below 0.5 indicate the predictive model is leaning toward the individual not achieving the outcome, and vice versa. |
| fdy_outcome_propensity_percentile | Relative rank of the individual's score among all values. | 1=lowest, 100=highest. To get the top 1% of scores, you want percentiles 99–100. |
| fdy_outcome_propensity_probability | Absolute score (scale of 0.0-1.0) of the individual based on the model used. | Probability scores indicate the likelihood of an individual to achieve the outcome. A score of 0.75 indicates they have a 75% chance of achieving it. |
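
To make the score columns concrete, here's a short Python sketch that turns one row's probability and percentile into a plain-language summary, using the interpretations given above (below 0.5 leans against the outcome; percentile 1 is lowest and 100 is highest). The function name and the sample row are hypothetical.

```python
def describe_score(row):
    """Summarize one deployment row's probability and percentile in words."""
    probability = float(row["fdy_outcome_propensity_probability"])
    percentile = int(row["fdy_outcome_propensity_percentile"])
    if probability > 0.5:
        leaning = "leaning toward the outcome"
    elif probability < 0.5:
        leaning = "leaning against the outcome"
    else:
        leaning = "evenly balanced"
    return f"{probability:.0%} chance ({leaning}), percentile {percentile} of 100"


# Invented row for illustration; CSV values arrive as strings.
row = {
    "fdy_outcome_propensity_probability": "0.75",
    "fdy_outcome_propensity_percentile": "90",
}
print(describe_score(row))
# 75% chance (leaning toward the outcome), percentile 90 of 100
```
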
Aggregated deployments

| Column name | Definition | Additional info |
| --- | --- | --- |
| County | The aggregation level selected when creating the deployment. | Can be county, metro, state, or zipcode. |
| Metro | The aggregation level selected when creating the deployment. | Can be county, metro, state, or zipcode. |
| State | The aggregation level selected when creating the deployment. | Can be county, metro, state, or zipcode. |
| Zipcode | The aggregation level selected when creating the deployment. | Can be county, metro, state, or zipcode. |
| [count or avg]_fdy_outcome_propensity_score | The total number of people in the location based on selected deployment filters (count) or the average score of people in this aggregated location based on selected deployment filters (avg). | Avg is the absolute score (scale of 0.0-1.0) of the individuals based on the model used. Propensity scores below 0.5 indicate the predictive model is leaning toward the individuals not achieving the outcome, and vice versa. |
| [count or avg]_fdy_outcome_propensity_percentile | The total number of people in the location based on selected deployment filters (count) or the average percentile of people in this aggregated location based on selected deployment filters (avg). | Relative rank of the individual's score among all values. Average percentile: 1=lowest, 100=highest. To get the top 1% of scores, you want percentiles 99–100. |
| [count or avg]_fdy_outcome_propensity_probability | The total number of people in the location based on selected deployment filters (count) or the average probability of people in this aggregated location based on selected deployment filters (avg). | Absolute score (scale of 0.0-1.0) of the individual based on the model used. Probability scores indicate the likelihood of an individual to achieve the outcome. A score of 0.75 indicates they have a 75% chance of achieving it. |
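
The count and avg columns can be pictured as a group-by over individual rows. Here's a minimal Python sketch of that roll-up. It illustrates the idea rather than Faraday's aggregation logic; the function name and the sample rows are invented.

```python
from collections import defaultdict


def aggregate_by(rows, geography, score_column):
    """Group rows by a geography column and compute count and average score."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[geography]].append(float(row[score_column]))
    return {
        place: {
            f"count_{score_column}": len(scores),
            f"avg_{score_column}": sum(scores) / len(scores),
        }
        for place, scores in groups.items()
    }


# Invented individual-level rows for illustration.
people = [
    {"state": "VT", "fdy_outcome_propensity_probability": "0.75"},
    {"state": "VT", "fdy_outcome_propensity_probability": "0.25"},
    {"state": "NY", "fdy_outcome_propensity_probability": "0.875"},
]

summary = aggregate_by(people, "state", "fdy_outcome_propensity_probability")
print(summary["VT"]["count_fdy_outcome_propensity_probability"])  # 2
print(summary["VT"]["avg_fdy_outcome_propensity_probability"])    # 0.5
```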

Understanding score explainability

When adding a payload to your pipeline, you can tick the include prediction explanations checkbox to add score explainability to your deployments. These explanations detail which traits had the highest impact in calculating each individual's predicted score.

Image of a Faraday score explainability output

Above, we see an example of CSV output from a pipeline. John is affected by this outcome's major factors, age and number of children, which leave him with a low probability of converting.

Jane, on the other hand, exhibits a more unusual combination of traits that gives her a much higher conversion probability for different reasons. Both her household income and millennial lifestyle pushed her conversion probability higher than John's, because this business often sees conversions from people with those traits, even if they're not the dominant traits.

Score explainability can help you understand, down to the individual level, which traits in your data influence how likely (or unlikely) individuals are to achieve your predictive outcomes.

📘Score explainability headers

Deleting a pipeline

To delete a pipeline, click the options menu (three dots) on the far right of the pipeline you'd like to delete, then click delete. If the pipeline is in use by other objects in Faraday, such as a deployment within the pipeline, the delete pipeline popup will indicate that you need to modify those in order to delete the pipeline. Once there are no other objects using this pipeline, you can safely delete it.

📘Deletion dependencies

Screenshot of deleting a pipeline