Sep. 2nd, 2020

jbanana: Badly drawn banana (Default)
An SQL group by lets you find something like "How heavy is the heaviest of each type of animal?"
select type, max(weight) as maxweight
from animals
group by type
type  maxweight
----- ---------
Cat           9
Dog          21
Horse       999
But which particular animal is the biggest? I would have joined those results back to animals, and that works, but for the (enormous) table I was looking at today, that was very slow. Internet suggests losing the group by and joining. WTF?
select a.*
from animals a
  left join animals x on x.type = a.type and a.weight < x.weight
where x.weight is null
"Join all the animals with animals of the same type who weigh more, and the one that didn't join is the heaviest" Now there's a lot of joining going on there, most of which is thrown away, but it's alleged to be a sight quicker than my naive approach.

We'll see.

I made the animal data up. Can a horse weigh a tonne? No idea.

Edit: either I've done it wrong, or it's not spectacularly different. Plan C is "pseudo-group-by" with a windowing function:
select *
from (
    select
        *,
        row_number() over (partition by type order by weight desc) as rownum
    from animals
) x
where x.rownum = 1
Should work, not very readable.

Maybe this is just a hard problem for huge tables.

Edit 2: plan C worked. I cancelled both of the previous attempts after over 20 minutes with still no results. Plan C finished in 34 seconds. Ugly, but fast.

August 2025

M T W T F S S
    123
45678910
11121314151617
1819202122 2324
25262728293031

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Nov. 13th, 2025 10:10 am
Powered by Dreamwidth Studios