Code comparison: Preql, SQL and the rest

This document was written with the aid of Pandas’ comparison with SQL.

Use the checkboxes to hide/show the code examples for each language.

Table Operations

Selecting columns

Column selection is done using the projection operator, {}.

tips{total_bill, tip, smoker, time}

The table name (tips) comes first, so that Preql can automatically suggest the field names.

In SQL, selection is done using the SELECT statement

SELECT total_bill, tip, smoker, time FROM tips;
tips[['total_bill', 'tip', 'smoker', 'time']]

Filtering rows

Row filtering is done using the filter operator, []:

tips[size >= 5 or total_bill > 45]
SELECT * FROM tips WHERE size >= 5 OR total_bill > 45;

DataFrames can be filtered in multiple ways; Pandas suggest using boolean indexing:

tips[(tips['size'] >= 5) | (tips['total_bill'] > 45)]
from sqlalchemy import or_
session.query(Tips).filter(or_(Tips.size >= 5, Tips.total_bill > 45))

Group by / Aggregation

In this example, we calculate how the amount of tips differs by day of the week.

Preql extends the projection operator to allow aggregation using the => construct:

tips{day => avg(tip), count()}

Conceptually, everything on the left of => are the keys, and on the right are the aggregated values.

SELECT day, AVG(tip), COUNT(*) FROM tips GROUP BY day;
tips.groupby('day').agg({'tip': np.mean, 'day': np.size})
from sqlalchemy import func
session.query(Tips.day, func.avg(Tips.tip), func.count(Tips.id)).group_by(Tips.day).all()

Concat, Union

In this example, we will concatenate and union two tables together.

table1 + table2	// concat
table1 | table2	// union
SELECT * FROM table1 UNION ALL SELECT * FROM table2;  -- concat
SELECT * FROM table1 UNION SELECT * FROM table2;      -- union
pd.concat([table1, table2])                      # concat
pd.concat([table1, table2]).drop_duplicates()    # union
union_all(session.query(table1), session.query(table2))      # concat
union(session.query(table1), session.query(table2))          # union

Top n rows with offset (limit)

tips[5..15]
// OR
tips[5..][..10]
SELECT * FROM tips ORDER BY tip DESC LIMIT 10 OFFSET 5;
tips.nlargest(10 + 5).tail(10)

Join

Join is essentially an operation that matches rows between two tables, based on common attributes.

join(a: table1.key1, b: table2.key2)

The result is a table with two columns, a and b, which are structs that each contain the columns of their respective table.

If we have pre-defined a “default join” between tables, we can shorten it to:

join(a: table1, b: table2)

Preql also offers the functions leftjoin(), outerjoin(), and joinall().

SELECT * FROM table1 INNER JOIN table2 ON table1.key1 = table2.key2;
pd.merge(df1, df2, on='key')

(it gets complicated if the key isn’t with the same name)

session.query(Table1).join(Tables2).filter(Table1.key1 == Table2.key2)

Insert row

Insert a row to the table, and specifying the columns by name.

new Country(name: "Spain", language: "Spanish")
INSERT INTO Country (name, language) VALUES ("Spain", "Spanish")
countries = countries.append({'name':'Spain', 'language': 'Spanish'}, ignore_index=True)
session.add(Country(name='Spain', language='Spanish'))

Update rows

tips[tip < 2] update {tip: tip*2}

Preql puts the update keyword after the selection, so that when working interactively, you can first see which rows you’re about to update.

UPDATE tips SET tip = tip*2 WHERE tip < 2;
tips.loc[tips['tip'] < 2, 'tip'] *= 2

(takes a different form for complex operations)

Gotchas

Null checks

Comparisons to null behave like in Python.

tips[col2==null]

Preql also has a value called unknown, which behaves like SQL’s NULL.

Simple comparison to NULL using =, will always return NULL. For comparing to NULL, you must use the IS operator (the operator name changes between dialects).

SELECT * FROM tips WHERE col2 IS NULL;
tips[tips['col2'].isna()]

Programming

Defining a function, and calling it from the query

func add_one(x: int) = x + 1

my_table{ add_one(my_column) }

(Type annotations validate the values at compile-time)

(Postgres dialect)

CREATE FUNCTION add_one(x int)
RETURNS int
AS
$$
 SELECT x + 1
$$
LANGUAGE SQL IMMUTABLE STRICT;

SELECT add_one(my_column) FROM my_table;
def add_one(x: int):
    return x + 1

my_table['my_column'].apply(add_one)

Impossible?

Counting a table from Python

This example demonstrates Preql’s Python API.

All examples set row_count to an integer value.

(assumes p is a preql instance)

row_count = len(p.my_table)

Or:

row_count = p('count(my_table)')
cur = conn.execute('SELECT COUNT() FROM my_table')
row_count = cur.fetchall()[0][0]
row_count = len(my_table.index)
row_count = session.query(my_table).count()