jbanana | Sep. 2nd, 2020

An SQL group by lets you find something like "How heavy is the heaviest of each type of animal?"

select type, max(weight) as maxweight
from animals
group by type

type  maxweight
----- ---------
Cat           9
Dog          21
Horse       999

But which particular animal is the biggest? I would have joined those results back to animals, and that works, but for the (enormous) table I was looking at today, that was very slow. Internet suggests losing the group by and joining. WTF?

select a.*
from animals a
  left join animals x on x.type = a.type and a.weight < x.weight
where x.weight is null

"Join all the animals with animals of the same type who weigh more, and the one that didn't join is the heaviest" Now there's a lot of joining going on there, most of which is thrown away, but it's alleged to be a sight quicker than my naive approach.

We'll see.

I made the animal data up. Can a horse weigh a tonne? No idea.

Edit: either I've done it wrong, or it's not spectacularly different. Plan C is "pseudo-group-by" with a windowing function:

select *
from (
    select
        *,
        row_number() over (partition by type order by weight desc) as rownum
    from animals
) x
where x.rownum = 1

Should work, not very readable.

Maybe this is just a hard problem for huge tables.

Edit 2: plan C worked. I cancelled both of the previous attempts after over 20 minutes with still no results. Plan C finished in 34 seconds. Ugly, but fast.

M	T	W	T	F	S	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

That bloke

Those words

Sep. 2nd, 2020

Sep. 2nd, 2020

SQL: alternative to group by clause

Profile

May 2025

Most Popular Tags

Page Summary

Active Entries

Style Credit

Expand Cut Tags