Technical

1 minute read

Environments

Anaconda

Creating an Anaconda Environment

Source

source activate myenv
python -m ipykernel install --user --name myenv --display-name "Python (myenv)"

Reviews

Data Bootcamp Tips (2017)

Snowflake

Parsing JSON with ANSCII SQL in Snowflake

LISTAGG Missing

If one were to use the LISTAGG operation like the following, and NULL values still exist in said list, they are included in the resulting operation

LISTAGG( [ DISTINCT ] <expr1> [, <delimiter> ]

However, if you add the resulting ORDER BY condition in the remainder of LISTAGG, it returns a list of non-NULL values with such delimiter

LISTAGG( [ DISTINCT ] <expr1> [, <delimiter> ] ) [ WITHIN GROUP ( <orderby_clause> ) ]

So, e.g.

SELECT LISTAGG(o_orderkey, ', ')
    FROM orders WHERE o_totalprice > 450000;

------------------------------------------------------+
             LISTAGG(O_ORDERKEY, ' ')                 |
------------------------------------------------------+
 41445, 55937, 67781, 80550, 95808, 101700, 103136, ''|
------------------------------------------------------+

The additional value at the end is due to a NULL value. So, to exclude it, add the remaining WITHIN GROUP to remove it in th results.

SELECT LISTAGG(o_orderkey, ‘, ‘) WITHIN GROUP (ORDER BY 1) AS orders FROM orders WHERE o_totalprice > 450000; ```

Migrating from one query engine and/or database to Snowflake