BigQuery
Create a connection between Faraday and Google BigQuery so that the data behind your predictions stays up to date and your predictions sync back to your warehouse.
In this tutorial, we'll show you how to:
- Connect your BigQuery account to Faraday using a connection.
Let's dive in.
Prerequisites
- You'll need a Faraday account. Signup is free!
You'll need the following details to create your connection to BigQuery:
- Dataset name (required, text)
- Project ID (required, text)
Granting access
First, you'll need to grant Faraday access to your BigQuery account.
BigQuery is a serverless data warehouse. Access is shared using Google Cloud IAM permissions. We suggest that you create a Faraday-only dataset to both send and receive data. Within this dataset, Faraday would have full read and write access. Alternatively, you can give Faraday access to certain tables in a shared dataset.
Which Faraday service account you grant access to (one or both) depends on whether you use Datasets, Targets, or both:
- Datasets: faraday-incoming@production-237317.iam.gserviceaccount.com
- Targets: faraday-outgoing@production-237317.iam.gserviceaccount.com
- Give the service account BigQuery Job User at the Project level
- Give the service account BigQuery Data Owner at the Dataset level
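The two grants above can be sketched as a small helper that prints the commands to run. This is a minimal sketch, not Faraday's own tooling: the function name `grant_commands` and its parameters are hypothetical, and it assumes you apply the project-level role with the `gcloud` CLI and the dataset-level role with a BigQuery SQL `GRANT` statement. The service account addresses and role names come from the list above.

```python
# Service accounts listed above; which one you need depends on your use case.
INCOMING = "faraday-incoming@production-237317.iam.gserviceaccount.com"  # Datasets
OUTGOING = "faraday-outgoing@production-237317.iam.gserviceaccount.com"  # Targets


def grant_commands(project_id, dataset_name, use_datasets=True, use_targets=True):
    """Return the commands that grant Faraday the two roles described above.

    Hypothetical helper: it only builds strings and does not call Google Cloud.
    """
    accounts = []
    if use_datasets:
        accounts.append(INCOMING)
    if use_targets:
        accounts.append(OUTGOING)

    commands = []
    for account in accounts:
        member = f"serviceAccount:{account}"
        # BigQuery Job User at the project level (run with the gcloud CLI).
        commands.append(
            f"gcloud projects add-iam-policy-binding {project_id} "
            f"--member={member} --role=roles/bigquery.jobUser"
        )
        # BigQuery Data Owner at the dataset level (run as BigQuery SQL;
        # a dataset is addressed as a SCHEMA in GRANT statements).
        commands.append(
            f"GRANT `roles/bigquery.dataOwner` "
            f"ON SCHEMA `{project_id}`.`{dataset_name}` "
            f'TO "{member}";'
        )
    return commands
```

Calling `grant_commands("my-project", "my_dataset", use_targets=False)` returns one `gcloud` command and one `GRANT` statement for the incoming service account only; the project and dataset names here are placeholders.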
Faraday suggests that you use an unguessable string somewhere in the path to your data. This avoids what is called the confused deputy problem.
For example, let's say you were using S3. Instead of naming an S3 bucket s3://faraday-acme/, name it s3://faraday-acme-pwiiprz162ez. This guarantees that malicious actors cannot guess the name and request that Faraday import data from it into their account. The same logic applies to any path that is used to locate data.
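One way to produce such a string is sketched below. The helper name `unguessable_name` and the 12-character length are illustrative assumptions, not a Faraday requirement. It joins the suffix with an underscore because BigQuery dataset names allow only letters, digits, and underscores.

```python
import secrets
import string

# Lowercase letters and digits, matching the style of the example suffix above.
ALPHABET = string.ascii_lowercase + string.digits


def unguessable_name(prefix, length=12):
    """Append a cryptographically random suffix to prefix (hypothetical helper)."""
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(length))
    # Underscore separator keeps the result valid as a BigQuery dataset name.
    return f"{prefix}_{suffix}"
```

For example, `unguessable_name("faraday_acme")` yields a name of the form `faraday_acme_` followed by 12 random characters. The `secrets` module is used rather than `random` because the suffix has to resist guessing.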
Connecting
Use a POST /connections request:
curl https://api.faraday.ai/connections --json '{
  "name": "BigQuery",
  "options": {
    "type": "bigquery",
    "dataset_name": "...",
    "project_id": "..."
  }
}'
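The same request can be sketched in Python using only the standard library. The function names are hypothetical, and the `Authorization: Bearer` header is an assumption: the curl example above shows no authentication, so check how your Faraday API key should be sent before relying on it. The URL and payload fields are taken from the curl example.

```python
import json
import urllib.request

API_URL = "https://api.faraday.ai/connections"  # endpoint from the curl example


def build_connection_request(dataset_name, project_id, api_key=None, name="BigQuery"):
    """Build (but do not send) the POST /connections request."""
    payload = {
        "name": name,
        "options": {
            "type": "bigquery",
            "dataset_name": dataset_name,
            "project_id": project_id,
        },
    }
    # curl's --json flag sets these two headers.
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if api_key:
        # Assumption: bearer-token auth; not shown in the curl example.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def create_connection(dataset_name, project_id, api_key=None):
    """Send the request and return the decoded JSON response."""
    request = build_connection_request(dataset_name, project_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the payload before anything reaches the API.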
Wait briefly while Faraday establishes your connection. It shouldn't take long.
Your new connection is now ready to use.