How to Efficiently Query your Bids Data
Last updated on July 23, 2020
More About Snowflake Credits
Snowflake credits are consumed for the compute costs of running queries against a Snowflake warehouse. The more complex a query is, the more computationally taxing it will be, and the more credits it will consume. Credits are consumed per second of active warehouse time (at least one query is running). Hourly credit consumption depends on your warehouse size. Naturally, larger warehouses return results more quickly, but spend more credits.
By default OpenX provides all customers with a “X-Small” warehouse, which consumes 1 credit for each hour of querying. With a 500 monthly credit limit, that is equivalent to ~500 hours of monthly query time. Warehouses can be resized, so OpenX can provide your account with a larger warehouse if you want your queries to return faster.
Warehouses can be set to auto-suspend after a certain amount of time with no queries running. The minimum for this time threshold is 60 seconds, meaning each query will incur at least 60 seconds of credit consumption.
Credit consumption is more crucial in a OpenX-hosted instance as OpenX imposes a monthly credit limit of 500.
Customers using integrated BIDS will be on their own organization’s Snowflake bill and should follow their own organization’s policies around credit consumption.
Tips for Running Efficient Queries
Best practices for efficiently querying BIDS data:
- Have a goal in mind: Doing “select
*”s can burn through your credits quickly and not provide you with enough insight.
- Have a data date range in mind: You don’t want to query any time ranges you do not require.
- Pull only the things you need: Find the column(s) you need and add filters so the dataset only contains what you need for your analysis.
- Check “History” of your past queries (in your worksheet): This will give you some good information of the queries you ran like execution duration, SQL executed, and start and end timestamps.
- Be careful with doing joins: Joining tables without providing a join condition (resulting in a Cartesian Product), or providing a condition where records from one table match multiple records from another table. In both cases it will explode your data processing and output.
- Be careful of doing unions: “
UNION” and “
UNION ALL” do slightly different things. “
UNION” concatenation and does duplication removal whereas “
UNION ALL” only does concatenation