Subqueries & CTEs
hypequery provides support for Common Table Expressions (CTEs) and raw SQL capabilities that allow you to create complex queries with subqueries.
Common Table Expressions (CTEs)
CTEs are temporary result sets that you can reference within a query. They help make complex queries more readable and maintainable.
Using QueryBuilder as a CTE
You can use another QueryBuilder instance as a CTE:
import { createQueryBuilder } from '@hypequery/core';
const builder = createQueryBuilder(schema);
// Create a subquery builder
const activeUsersSubquery = builder
.table('users')
.select(['id', 'user_name', 'email'])
.where('status', 'eq', 'active');
// Use it as a CTE in the main query
const results = await builder
.table('orders')
.withCTE('active_users', activeUsersSubquery)
.select([
'orders.id',
'orders.total',
'active_users.user_name'
])
.innerJoin('active_users', 'user_id', 'active_users.id')
.execute();
This will generate SQL similar to:
WITH active_users AS (
SELECT id, user_name, email
FROM users
WHERE status = 'active'
)
SELECT orders.id, orders.total, active_users.user_name
FROM orders
INNER JOIN active_users ON orders.user_id = active_users.id
Using Raw SQL as a CTE
For more complex subqueries, you can use raw SQL strings:
const results = await builder
.table('orders')
.withCTE(
'monthly_totals',
'SELECT user_id, toStartOfMonth(created_at) as month, SUM(total) as monthly_sum FROM orders GROUP BY user_id, month'
)
.select([
'orders.id',
'orders.created_at',
'monthly_totals.monthly_sum'
])
.innerJoin('monthly_totals', 'user_id', 'monthly_totals.user_id')
.where('orders.created_at', 'gte', 'monthly_totals.month')
.execute();
Multiple CTEs
You can chain multiple CTEs for complex analytics:
const results = await builder
.table('events')
.withCTE(
'daily_users',
'SELECT user_id, toDate(timestamp) as day, COUNT(*) as event_count FROM events GROUP BY user_id, day'
)
.withCTE(
'active_users',
'SELECT user_id, COUNT(DISTINCT day) as active_days FROM daily_users GROUP BY user_id HAVING active_days > 7'
)
.select(['events.*'])
.innerJoin('active_users', 'user_id', 'active_users.user_id')
.execute();
Raw SQL Expressions
For complex conditions that can’t be expressed using the fluent API, you can use raw SQL expressions:
HAVING Clauses
The raw()
method allows you to add custom conditions to the HAVING clause:
const results = await builder
.table('orders')
.select(['user_id'])
.sum('total', 'total_spent')
.groupBy(['user_id'])
.raw('SUM(total) > 1000')
.raw('COUNT(DISTINCT product_id) >= 3')
.execute();
This will generate:
SELECT user_id, SUM(total) AS total_spent
FROM orders
GROUP BY user_id
HAVING SUM(total) > 1000 AND COUNT(DISTINCT product_id) >= 3
Limitations and Workarounds
hypequery doesn’t directly support nested subqueries in WHERE clauses, but you can work around this with CTEs or raw SQL expressions:
Example: IN Subqueries
To achieve a query like:
SELECT * FROM products
WHERE category_id IN (SELECT id FROM categories WHERE active = 1)
Use a CTE approach:
// Create a CTE for active categories
const results = await builder
.table('products')
.withCTE(
'active_categories',
'SELECT id FROM categories WHERE active = 1'
)
.select(['products.*'])
.innerJoin('active_categories', 'category_id', 'active_categories.id')
.execute();
Example: Correlated Subqueries
For more complex scenarios like correlated subqueries, you may need to use your database client directly:
import { ClickHouseConnection } from '@hypequery/clickhouse';
// For complex queries that hypequery can't easily build
const client = ClickHouseConnection.getClient();
const result = await client.query({
query: `
SELECT
u.id as user_id,
u.user_name,
(
SELECT COUNT(*)
FROM orders o
WHERE o.user_id = u.id AND o.status = 'completed'
) as completed_orders
FROM users u
WHERE u.status = 'active'
`,
format: 'JSONEachRow'
});
const data = await result.json();
Use Cases
Analytics Queries
CTEs are particularly useful for analytics queries:
const results = await builder
.table('events')
.withCTE(
'daily_stats',
'SELECT toDate(timestamp) as day, COUNT(*) as events, COUNT(DISTINCT user_id) as users FROM events GROUP BY day'
)
.withCTE(
'weekly_stats',
'SELECT toStartOfWeek(day) as week, SUM(events) as events, SUM(users) as users FROM daily_stats GROUP BY week'
)
.select(['week', 'events', 'users'])
.from('weekly_stats')
.orderBy('week', 'DESC')
.limit(10)
.execute();
Hierarchical Data
For hierarchical data or recursive queries, use a raw SQL approach with CTEs:
const client = ClickHouseConnection.getClient();
const result = await client.query({
query: `
WITH RECURSIVE category_tree AS (
SELECT id, parent_id, name, 1 as level
FROM categories
WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.parent_id, c.name, ct.level + 1
FROM categories c
INNER JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT * FROM category_tree ORDER BY level, name
`,
format: 'JSONEachRow'
});
const data = await result.json();
Best Practices
-
Use CTEs for readability: Break complex queries into logical parts with CTEs.
-
Optimize CTE reuse: ClickHouse may materialize CTEs if used multiple times in a query.
-
Consider performance: Complex subqueries can impact performance; test with representative data volumes.
-
Type safety: When using raw SQL, you may lose type safety. Cast results appropriately.