Code comparison: Preql, SQL and the rest¶
This document was written with the aid of Pandas’ comparison with SQL.
Use the checkboxes to hide/show the code examples for each language.
Table Operations¶
Selecting columns¶
Column selection is done using the projection operator, {}
.
tips{total_bill, tip, smoker, time}
The table name (tips
) comes first, so that Preql can automatically suggest the field names.
In SQL, selection is done using the SELECT statement
SELECT total_bill, tip, smoker, time FROM tips;
tips[['total_bill', 'tip', 'smoker', 'time']]
Filtering rows¶
Row filtering is done using the filter operator, []
:
tips[size >= 5 or total_bill > 45]
SELECT * FROM tips WHERE size >= 5 OR total_bill > 45;
DataFrames can be filtered in multiple ways; Pandas suggest using boolean indexing:
tips[(tips['size'] >= 5) | (tips['total_bill'] > 45)]
from sqlalchemy import or_
session.query(Tips).filter(or_(Tips.size >= 5, Tips.total_bill > 45))
Group by / Aggregation¶
In this example, we calculate how the amount of tips differs by day of the week.
Preql extends the projection operator to allow aggregation using the =>
construct:
tips{day => avg(tip), count()}
Conceptually, everything on the left of =>
are the keys, and on the right are the aggregated values.
SELECT day, AVG(tip), COUNT(*) FROM tips GROUP BY day;
tips.groupby('day').agg({'tip': np.mean, 'day': np.size})
from sqlalchemy import func
session.query(Tips.day, func.avg(Tips.tip), func.count(Tips.id)).group_by(Tips.day).all()
Concat, Union¶
In this example, we will concatenate and union two tables together.
table1 + table2 // concat
table1 | table2 // union
SELECT * FROM table1 UNION ALL SELECT * FROM table2; -- concat
SELECT * FROM table1 UNION SELECT * FROM table2; -- union
pd.concat([table1, table2]) # concat
pd.concat([table1, table2]).drop_duplicates() # union
union_all(session.query(table1), session.query(table2)) # concat
union(session.query(table1), session.query(table2)) # union
Top n rows with offset (limit)¶
tips[5..15]
// OR
tips[5..][..10]
SELECT * FROM tips ORDER BY tip DESC LIMIT 10 OFFSET 5;
tips.nlargest(10 + 5).tail(10)
Join¶
Join is essentially an operation that matches rows between two tables, based on common attributes.
join(a: table1.key1, b: table2.key2)
The result is a table with two columns, a
and b
, which are structs that each contain the columns of their respective table.
If we have pre-defined a “default join” between tables, we can shorten it to:
join(a: table1, b: table2)
Preql also offers the functions leftjoin()
, outerjoin()
, and joinall()
.
SELECT * FROM table1 INNER JOIN table2 ON table1.key1 = table2.key2;
pd.merge(df1, df2, on='key')
(it gets complicated if the key isn’t with the same name)
session.query(Table1).join(Tables2).filter(Table1.key1 == Table2.key2)
Insert row¶
Insert a row to the table, and specifying the columns by name.
new Country(name: "Spain", language: "Spanish")
INSERT INTO Country (name, language) VALUES ("Spain", "Spanish")
countries = countries.append({'name':'Spain', 'language': 'Spanish'}, ignore_index=True)
session.add(Country(name='Spain', language='Spanish'))
Update rows¶
tips[tip < 2] update {tip: tip*2}
Preql puts the update
keyword after the selection, so that when working interactively, you can first see which rows you’re about to update.
UPDATE tips SET tip = tip*2 WHERE tip < 2;
tips.loc[tips['tip'] < 2, 'tip'] *= 2
(takes a different form for complex operations)
Gotchas¶
Null checks¶
Comparisons to null
behave like in Python.
tips[col2==null]
Preql also has a value called unknown
, which behaves like SQL’s NULL
.
Simple comparison to NULL
using =
, will always return NULL
. For comparing to NULL
, you must use the IS
operator (the operator name changes between dialects).
SELECT * FROM tips WHERE col2 IS NULL;
tips[tips['col2'].isna()]
Programming¶
Defining a function, and calling it from the query¶
func add_one(x: int) = x + 1
my_table{ add_one(my_column) }
(Type annotations validate the values at compile-time)
(Postgres dialect)
CREATE FUNCTION add_one(x int)
RETURNS int
AS
$$
SELECT x + 1
$$
LANGUAGE SQL IMMUTABLE STRICT;
SELECT add_one(my_column) FROM my_table;
def add_one(x: int):
return x + 1
my_table['my_column'].apply(add_one)
Impossible?
Counting a table from Python¶
This example demonstrates Preql’s Python API.
All examples set row_count
to an integer value.
(assumes p
is a preql instance)
row_count = len(p.my_table)
Or:
row_count = p('count(my_table)')
cur = conn.execute('SELECT COUNT() FROM my_table')
row_count = cur.fetchall()[0][0]
row_count = len(my_table.index)
row_count = session.query(my_table).count()