Sep. 2nd, 2020

jbanana: Badly drawn banana (Default)
An SQL group by lets you find something like "How heavy is the heaviest of each type of animal?"
select type, max(weight) as maxweight
from animals
group by type
type  maxweight
----- ---------
Cat           9
Dog          21
Horse       999
But which particular animal is the biggest? I would have joined those results back to animals, and that works, but for the (enormous) table I was looking at today, that was very slow. Internet suggests losing the group by and joining. WTF?
select a.*
from animals a
  left join animals x on x.type = a.type and a.weight < x.weight
where x.weight is null
"Join all the animals with animals of the same type who weigh more, and the one that didn't join is the heaviest" Now there's a lot of joining going on there, most of which is thrown away, but it's alleged to be a sight quicker than my naive approach.

We'll see.

I made the animal data up. Can a horse weigh a tonne? No idea.

Edit: either I've done it wrong, or it's not spectacularly different. Plan C is "pseudo-group-by" with a windowing function:
select *
from (
    select
        *,
        row_number() over (partition by type order by weight desc) as rownum
    from animals
) x
where x.rownum = 1
Should work, not very readable.

Maybe this is just a hard problem for huge tables.

Edit 2: plan C worked. I cancelled both of the previous attempts after over 20 minutes with still no results. Plan C finished in 34 seconds. Ugly, but fast.

May 2025

M T W T F S S
   1234
5678 91011
12131415161718
19202122232425
262728293031 

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Jul. 7th, 2025 10:52 pm
Powered by Dreamwidth Studios