DISQUS

Eddie Awad’s Blog: Keep first, Keep last

  • Anthony Wilson · 3 years ago

    The two example queries will not return the same results in general.


    If there are 2 values of n in the table with equal maximum frequencies (i.e. two modes), then the first query will return both top ranked values. The second query will return the maximum of the two top-ranked values.


    Also, see:


    select stats_mode(n) from t;


    ... which will return strictly one of the modes, although we can't predict which one.

  • Eddie Awad · 3 years ago

    Anthony, you're right. That's an important distinction. So, to test, I added the following two rows to t:


    <pre>
    insert into t (n) values (400)
    /
    insert into t (n) values (400)
    /
    </pre>

    Then, I ran the first query:


    <pre>
    HR@XE> select n
    2 from
    3 (select n,
    4 dense_rank() over(
    5 order by cnt desc) as
    6 rnk
    7 from
    8 (select n,
    9 count(*) as
    10 cnt
    11 from t
    12 group by n)
    13 x)
    14 y
    15 where rnk = 1
    16 /
    </pre>

    I got:


    <pre>
    N
    ----------
    400
    300
    </pre>

    When I ran the second query:


    <pre>
    HR@XE> select max(n) keep(dense_rank first
    2 order by cnt desc) n
    3 from
    4 (select n,
    5 count(*) cnt
    6 from t
    7 group by n)
    8 /
    </pre>

    I got:


    <pre>
    N
    ----------
    400
    </pre>

    Which proves that KEEP...FIRST will only keep one value: the greatest number.


    Of course, you could change to first query to select the maximum:


    <pre>
    select max(n) from ( ...first query ...)
    </pre>

    which effectively makes it return the same result as the one with KEEP in this example.


    However, testing stats_mode:


    <pre>
    HR@XE> select stats_mode(n) from t;
    </pre>

    Returned:


    <pre>
    STATS_MODE(N)
    -------------
    300
    </pre>

    stats_mode does not work in 9i or below.


    Thanks for this clarification.