Pipelines
In Pipelines, you’ll take the building blocks of your predictions (your outcomes, personas, and cohorts) and create a prediction pipeline that you can plug directly into your data warehouse, cloud bucket, or, via managed deployment, your favorite martech software (ESP, CRM, etc.). Once the pipeline is complete, your predictions are kept up to date and automatically delivered to the destination you chose.
Getting started
Inside Pipelines, you'll find a list of your current pipelines if you have any, with columns for:
- Population: the population being targeted for the pipeline's predictions.
- Payload: the outcomes, cohorts, and/or persona sets included in the pipeline.
- Deployments: the target destination of the deployment.
- Status: whether the pipeline is ready, queued, or errored.
Creating a pipeline
- Select new pipeline in the upper right of the Pipelines list view.
- Next, select your payload for this pipeline. Your payload is a combination of your outcome, your persona sets, and other customer cohorts–all of which are used to customize your predictions.
- Outcome: the business outcome you want for this prediction deployment, configured in Outcomes.
- Persona set: the personas applied to this deployment. Each individual record on the deploying end of this pipeline will be assigned a persona.
- Cohort: Specific groups of customers based on criteria such as events and attributes, used as membership indicators in the pipeline.
📘Membership indicators
Membership indicators, or including cohorts in your payload, are useful for segmentation. For example, say you want to know who in your customer base has an income of greater than $100k. Your population to include for your pipeline would be your Customers cohort, and as part of your payload, you could then select a cohort of customers with an income of greater than $100k. As a result, anyone who is indicated as being in that $100k or greater cohort in your pipeline is a current customer with more than $100k in income.
🚧️Eligibility restrictions
If the outcome you select in your payload specified an eligibility cohort, your pipeline is not restricted by that same eligibility cohort. For example, if your outcome's eligibility cohort is Leads (e.g. in a lead scoring outcome), and your pipeline's population to include is Everyone, then everyone will be scored–not just your leads.
- Once your payload is selected, choose a population to include. This is the group of people you want to target with your predictions, such as Leads in the case of a traditional acquisition campaign.
- After you've targeted a population, optionally exclude a population. Any cohorts selected in this section will not be targeted with your predictions.
- With your criteria selected, click save pipeline. A loading bar will appear with a message indicating that your pipeline is building. Pipelines will generally take a few hours to build, and you'll receive an email when it's ready for use.
- When the pipeline is finished building, it will be disabled. You'll need to add a deployment before you can enable it; the next section explains how to add deployments.
🚧️Editing a pipeline
If a pipeline is edited, it will return to preview mode and will need to be re-enabled afterward.
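To make the include, exclude, and membership-indicator steps above concrete, here is a minimal Python sketch. It is not Faraday's implementation: cohorts are modelled as plain sets of person IDs, and the function name `build_pipeline_rows` and the cohort names are hypothetical.

```python
# Hypothetical sketch: cohorts are modelled as plain sets of person IDs.
def build_pipeline_rows(include, exclude, indicator_cohorts):
    """Return one row per targeted person, with a boolean flag per payload cohort."""
    targeted = sorted(include - exclude)  # excluded cohorts are never targeted
    rows = []
    for person_id in targeted:
        row = {"person_id": person_id}
        for name, members in indicator_cohorts.items():
            # Membership indicator: is this targeted person also in the cohort?
            row[name] = person_id in members
        rows.append(row)
    return rows


customers = {"a", "b", "c", "d"}     # population to include
churned = {"d"}                      # population to exclude
income_over_100k = {"b", "c", "z"}   # payload cohort; "z" is not a customer

for row in build_pipeline_rows(
    customers, churned, {"income_over_100k": income_over_100k}
):
    print(row)
```

Because the indicator is only evaluated for people in the included population, anyone flagged `True` here is both a current customer and a member of the income cohort, which matches the example in the membership indicators note.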
Adding deployments
Deployments are the method through which Faraday users plug their pipelines–and therefore their predictions–back into their stack. Deployments can be configured to send to Faraday-hosted CSV, as well as to data warehouses, cloud buckets, and your favorite ESPs, CRMs, etc. You'll find the deployment section within a pipeline, under the pipeline's definition.
To create a deployment:
- Click add in the appropriate section under Deployment.
- After clicking add, a window will open to provide specific options that enable you to tailor the deployment to your liking.
Choosing your deployment's representation format
- Select your data format:
- Hashed (default): Best for deploying audiences to ad platforms. Data is hashed, and not human-readable.
- Referenced: Best for merging data back into your stack. Uses a reference key defined in your dataset's advanced options to identify unique rows. If a reference key is not defined, this option is unavailable to select.
- Identified: Best for direct mail and canvassing campaigns. Data is unhashed and human-readable.
- Aggregated: Best for geotargeted ad campaigns. Select this to see the number of people in each payload element (outcome, persona, cohort) within the area of the geographic type you select.
- Next, select whether you'd like machine friendly or human friendly column headers.
- Machine friendly: Best for automated systems where consistent naming is relevant.
- Human friendly: Best for convenient, easy-to-read interpretation. Human friendly headers include the outcome name and prediction type, so each column is instantly recognizable. This can make your predictions easier to identify when deploying to ESPs, CRMs, etc., where you'll want to quickly see a contact's persona or propensity score on their contact card.
- Click next to move on to advanced settings.
Choosing your advanced settings & finalizing deployments
- Choose which, if any, of the advanced settings below you'd like to set for this deployment.
- Click save to finish the deployment.
- When your deployment is complete, your pipeline will still be disabled by default, but you're now able to test your deployment with the test deployment button. Clicking test deployment will output 100 rows of your pipeline to the URL in the deployment.
- To keep your pipeline up to date automatically on a daily basis and enable the full results of your pipeline, click the enabled toggle in the upper right. It will display green when the pipeline is enabled, and your full results can be retrieved from the URL listed in the deployment or via the download button.
Filter
Filter lets you filter the deployment by the persona sets, outcomes, and cohort memberships you selected for your pipeline's payload. You can target specific personas: for example, selecting the persona set that contains your Married Mary persona, choosing the "equal to" operator, and then selecting Married Mary will include only Married Marys in the deployment. Filtering by an outcome lets you target a range of rows by percentile or probability score, so you can focus on only the people that matter most to you.
- Outcome percentile is an integer between 1 and 100 (inclusive) and refers to the percentile of the outcome score distribution. The number of individuals in each percentile varies; as a rough estimate, the top 10 score percentiles correspond to the top 10% of the population. For example, entering greater than or equal to 81 would keep roughly the top 20% of the population scored.
- Outcome probability refers to the estimated probability of the outcome and is a decimal from 0 to 1. Include the leading zero and the decimal point when entering a value. For example, a score of .5 would be entered as 0.5, and reflects a 50% probability that an individual will achieve the outcome.
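The two outcome filters described above can be sketched in a few lines of Python. This is an illustration only, assuming rows are dictionaries with hypothetical `percentile` and `probability` keys; the function names are not part of Faraday.

```python
def filter_by_percentile(rows, minimum):
    """Keep rows whose outcome percentile is >= minimum (an integer from 1 to 100)."""
    if not isinstance(minimum, int) or not 1 <= minimum <= 100:
        raise ValueError("percentile must be an integer from 1 to 100")
    return [row for row in rows if row["percentile"] >= minimum]


def filter_by_probability(rows, minimum):
    """Keep rows whose outcome probability is >= minimum (a decimal from 0 to 1)."""
    if not 0.0 <= minimum <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return [row for row in rows if row["probability"] >= minimum]


# Illustrative data: exactly one row per percentile. Real percentiles
# hold varying numbers of individuals, so 20% is only a rough estimate.
sample = [{"percentile": p, "probability": p / 100} for p in range(1, 101)]

top = filter_by_percentile(sample, 81)
print(len(top))                                  # rows in percentiles 81 through 100
print(len(filter_by_probability(sample, 0.5)))   # rows with probability >= 0.5
```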
Limit
In Limit, you can restrict your results to a top count of rows or a bottom count of rows.
- Only the top/bottom (count) lets you export an exact number of rows.
📘Additional limit info
This limit refers only to rows and not necessarily to individuals. For hashed targets in particular, there are likely to be 2-3 rows per person (one for each email and physical address).
📘Large pipelines
For larger pipeline sizes (20M+), the ordering is approximate and may not precisely represent the very top/bottom scoring individuals.
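The difference between limiting rows and limiting individuals can be seen in a small Python sketch. The row layout (`person`, `identifier`, `score`) and the function `limit_rows` are hypothetical, and the sort here is exact, unlike the approximate ordering noted above for very large pipelines.

```python
def limit_rows(rows, count, end="top"):
    """Return exactly `count` rows from the top or bottom of the score ordering."""
    if end not in ("top", "bottom"):
        raise ValueError("end must be 'top' or 'bottom'")
    ordered = sorted(rows, key=lambda row: row["score"], reverse=(end == "top"))
    return ordered[:count]


# One row per identifier, so a person with an email and an address has two rows.
rows = [
    {"person": "p1", "identifier": "email", "score": 0.9},
    {"person": "p1", "identifier": "address", "score": 0.9},
    {"person": "p2", "identifier": "email", "score": 0.8},
    {"person": "p2", "identifier": "address", "score": 0.8},
    {"person": "p3", "identifier": "email", "score": 0.4},
]

top_three = limit_rows(rows, 3)
people = {row["person"] for row in top_three}
print(len(top_three), "rows cover", len(people), "individuals")
```

If you need a specific number of individuals rather than rows, plan for the limit to be somewhat larger than the head count you want.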
Structure
Under structure, you can rename and reorder columns. Renaming them can make it even more convenient to import your data into your activation platform. For ad platform deployments like LinkedIn, Facebook, and Google Ads, selecting the appropriate option in the dropdown enables you to organize the file in a way that's convenient for upload to that platform.
📘Column naming conventions
Column names don't allow spaces, so if you receive an error when saving, check that you don't have any spaces in renamed columns. Instead of "Faraday propensity score," try "faraday_propensity_score."
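If you keep a list of preferred column labels outside Faraday, a small helper like the one below can convert them to space-free names before you type them in. This is a hypothetical convenience sketch; lowercasing is an assumption taken from the example above, not a documented requirement.

```python
import re


def to_column_name(label):
    """Replace runs of whitespace with underscores and lowercase the result."""
    return re.sub(r"\s+", "_", label.strip()).lower()


def has_spaces(name):
    """Return True if the name contains any whitespace character."""
    return any(ch.isspace() for ch in name)


label = "Faraday propensity score"
print(has_spaces(label))          # True: this label would be rejected
print(to_column_name(label))      # faraday_propensity_score
```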
Connection-specific
In this last settings option, you'll see format options for hosted CSV deployments, or settings specific to the connection if you're deploying back to your database.
📘Advanced settings
These connection-specific settings are only recommended for advanced users and can safely be ignored otherwise.
Understanding deployment columns
A deployment in Faraday will include various points of data about your customers. When creating a deployment, in the structure section, you can select pre-formatted outputs for various destinations like Facebook, LinkedIn, and Google Ads to save the time & effort of formatting it yourself. In hashed deployments, personally identifiable information (PII) will be replaced by a hash key.
Hashed, identified, and referenced deployments
Column name | Definition | Additional info |
---|---|---|
row_id | Faraday's internal key. | |
person_first_name | First name of the individual. | |
person_last_name | Last name of the individual. | |
house_number_and_street | Physical address of the individual. | |
city | City the individual resides in. | |
state | State the individual resides in. | |
postcode | Postcode/Zip code the individual resides in. | |
email | Email address of the individual in Faraday's data. | |
fdy_persona_set_persona_id | The ID of the persona set in which this individual's persona exists (not the persona itself). | |
fdy_persona_set_persona_name | The name of the persona that the individual belongs to. | |
fdy_outcome_propensity_score | Absolute score (scale of 0.0-1.0) of the individual based on the model used. | Propensity scores below 0.5 indicate the predictive model is leaning toward the individual not achieving the outcome, and vice versa. |
fdy_outcome_propensity_percentile | Relative rank of the individual's score among all values. | 1=lowest, 100=highest. To get the top 1% of scores, you want percentiles 99–100. |
fdy_outcome_propensity_probability | Absolute score (scale of 0.0-1.0) of the individual based on the model used. | Probability scores indicate the likelihood of an individual to achieve the outcome. A score of 0.75 indicates they have a 75% chance of achieving it. |
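As a rough illustration of working with these columns downstream, the Python sketch below parses a small CSV and converts the numeric columns to numbers. The sample rows are invented, and the headers are the generic names from the table above; the headers in your own deployment depend on your outcome and persona set names and on the header style you chose.

```python
import csv
import io

# Invented sample data using the generic column names from the table above.
SAMPLE = """row_id,fdy_persona_set_persona_name,fdy_outcome_propensity_probability,fdy_outcome_propensity_percentile
1,Married Mary,0.75,99
2,Married Mary,0.10,12
"""


def read_rows(text):
    """Parse deployment CSV text into dictionaries with typed numeric fields."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append(
            {
                "row_id": raw["row_id"],
                "persona": raw["fdy_persona_set_persona_name"],
                "probability": float(raw["fdy_outcome_propensity_probability"]),
                "percentile": int(raw["fdy_outcome_propensity_percentile"]),
            }
        )
    return rows


for row in read_rows(SAMPLE):
    print(row["row_id"], row["persona"], row["probability"], row["percentile"])
```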
Aggregated deployments
Column name | Definition | Additional info |
---|---|---|
County | The aggregation level selected when creating the deployment. Can be county, metro, state, or zipcode. | |
Metro | The aggregation level selected when creating the deployment. Can be county, metro, state, or zipcode. | |
State | The aggregation level selected when creating the deployment. Can be county, metro, state, or zipcode. | |
Zipcode | The aggregation level selected when creating the deployment. Can be county, metro, state, or zipcode. | |
[count or avg]_fdy_outcome_propensity_score | The total number of people in the location based on selected deployment filters (count) or the average score of people in this aggregated location based on selected deployment filters (avg). | Avg is the absolute score (scale of 0.0-1.0) of the individuals based on the model used. Propensity scores below 0.5 indicate the predictive model is leaning toward the individuals not achieving the outcome, and vice versa. |
[count or avg]_fdy_outcome_propensity_percentile | The total number of people in the location based on selected deployment filters (count) or the average percentile of people in this aggregated location based on selected deployment filters (avg). | Relative rank of the individual's score among all values. Average percentile: 1=lowest, 100=highest. To get the top 1% of scores, you want percentiles 99–100. |
[count or avg]_fdy_outcome_propensity_probability | The total number of people in the location based on selected deployment filters (count) or the average probability of people in this aggregated location based on selected deployment filters (avg). | Absolute score (scale of 0.0-1.0) of the individual based on the model used. Probability scores indicate the likelihood of an individual to achieve the outcome. A score of 0.75 indicates they have a 75% chance of achieving it. |
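The count and avg columns above amount to a group-by over the selected geography. The following Python sketch shows the idea under simplifying assumptions: the row keys (`county`, `score`) and the function `aggregate_by_area` are hypothetical, and deployment filters are assumed to have been applied already.

```python
from collections import defaultdict


def aggregate_by_area(rows, area_key="county"):
    """Return {area: {"count": people, "avg": mean score}} for each area."""
    scores_by_area = defaultdict(list)
    for row in rows:
        scores_by_area[row[area_key]].append(row["score"])
    return {
        area: {"count": len(scores), "avg": sum(scores) / len(scores)}
        for area, scores in scores_by_area.items()
    }


people = [
    {"county": "county_a", "score": 0.25},
    {"county": "county_a", "score": 0.75},
    {"county": "county_b", "score": 0.5},
]

for area, stats in aggregate_by_area(people).items():
    print(area, stats["count"], stats["avg"])
```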
Deleting a pipeline
To delete a pipeline, click the options menu (three dots) on the far right of the pipeline you'd like to delete, then click delete. If the pipeline is in use by other objects in Faraday, such as a deployment within the pipeline, the delete pipeline popup will indicate that you need to modify those in order to delete the pipeline. Once there are no other objects using this pipeline, you can safely delete it.
📘Deletion dependencies
See the deletions documentation for the order of dependencies, or the order of deletion priority.