SQL: alternative to group by clause
Sep. 2nd, 2020 05:30 pmAn SQL
We'll see.
I made the animal data up. Can a horse weigh a tonne? No idea.
Edit: either I've done it wrong, or it's not spectacularly different. Plan C is "pseudo-group-by" with a windowing function:
Maybe this is just a hard problem for huge tables.
Edit 2: plan C worked. I cancelled both of the previous attempts after over 20 minutes with still no results. Plan C finished in 34 seconds. Ugly, but fast.
group by
lets you find something like "How heavy is the heaviest of each type of animal?" select type, max(weight) as maxweight from animals group by type
type maxweight ----- --------- Cat 9 Dog 21 Horse 999But which particular animal is the biggest? I would have joined those results back to
animals
, and that works, but for the (enormous) table I was looking at today, that was very slow. Internet suggests losing the group by
and joining. WTF? select a.* from animals a left join animals x on x.type = a.type and a.weight < x.weight where x.weight is null"Join all the animals with animals of the same type who weigh more, and the one that didn't join is the heaviest" Now there's a lot of joining going on there, most of which is thrown away, but it's alleged to be a sight quicker than my naive approach.
We'll see.
I made the animal data up. Can a horse weigh a tonne? No idea.
Edit: either I've done it wrong, or it's not spectacularly different. Plan C is "pseudo-group-by" with a windowing function:
select * from ( select *, row_number() over (partition by type order by weight desc) as rownum from animals ) x where x.rownum = 1Should work, not very readable.
Maybe this is just a hard problem for huge tables.
Edit 2: plan C worked. I cancelled both of the previous attempts after over 20 minutes with still no results. Plan C finished in 34 seconds. Ugly, but fast.