Anyone know of a better way to calculate the derivative of two series in SQL? I.e., given a table with incrementing counters does a better method of finding the rate of change in those counters exist? The next best solution I can think of has me using postgresql's scripting language. Example table and my current solution:
Table: `r`
id | rx_bytes | tx_bytes | stamp
----+------------+------------+----------------------------
1 | 4409175828 | 1953740054 | 2007-05-10 13:52:05.768994
2 | 4409317532 | 1953876624 | 2007-05-10 13:55:04.676586
3 | 4409586227 | 1954221384 | 2007-05-10 14:00:01.921733
4 | 4411577594 | 1954689574 | 2007-05-10 14:05:05.606004
5 | 4412308594 | 1955036986 | 2007-05-10 14:10:02.103433
6 | 4412563005 | 1955363345 | 2007-05-10 14:15:02.930761
7 | 4412821894 | 1955604425 | 2007-05-10 14:20:09.826853
8 | 4413091666 | 1955914285 | 2007-05-10 14:25:06.545572
9 | 4413338941 | 1956166222 | 2007-05-10 14:30:04.06891
10 | 4413602704 | 1956457986 | 2007-05-10 14:35:05.334229
select
( r2.rx_bytes - r1.rx_bytes )
/
( extract(epoch from r2.stamp) - extract(epoch from r1.stamp) )
from r as r1, r as r2
where r2.id = r1.id + 1;
Implemented in plpgsql, the refcursor would mean using FETCH syntax, so I chose to always return junk (0.0 or something else) for the first row. Also, I need to benchmark this on a large-ish table:
-- create language plpgsql;
create or replace function first_derivative( _a bigint, _b bigint ) returns numeric as $ph$
declare
srow record;
old_a bigint;
old_b bigint;
ret numeric;
BEGIN
-- setup temp table if we don't have it available
-- XXX refactor temp table name to variable
if not exists ( select 1 from pg_class where relname = 'first_derivative_temptable' ) then
create temporary table first_derivative_temptable (a bigint, b bigint);
insert into first_derivative_temptable values ( _a , _b );
return 0.0;
end if;
-- get old values
for srow in select * from first_derivative_temptable LOOP
old_a = srow.a;
old_b = srow.b;
end loop;
-- save derivative
ret = ( _a - old_a ) / ( _b - old_b );
-- insert values into temp table
update first_derivative_temptable set a = _a , b = _b;
return ret;
END;
$ph$ language plpgsql;
-- tests
select first_derivative( rx_bytes, cast( extract(epoch from stamp) as bigint) )
from raq1 where rx_bytes is not null order by stamp limit 5;
select * from first_derivative_temptable;
What do you dislike about your current solution? -- David W
Inner joins suck when tables get big. ;)
Is there a reason the values can't be computed when you do the inserts? -- David W
Well, I want the table to reflect the data captured as much as possible to ensure as much future usability as possible. Regardless, I like to push things in terms of configurability/usability. -- Patrick.
In that case I would write a stored procedure. -- David W
I think plpgsql can only return a refcursor to handle this, and that's what I'm currently working on.
Interesting approach. I would have probably implemented it as a procedure returning a recordset. I'm not sure which is better. -- David W
Doh, obviously my approach wouldn't work. The recordset would be too huge. -- David W
Note: The current implementation of RETURN NEXT for PL/pgSQL stores the entire result set before returning from the function, as discussed above. That means that if a PL/pgSQL function produces a very large result set, performance may be poor: data will be written to disk to avoid memory exhaustion, but the function itself will not return until the entire result set has been generated.
-- from http://www.postgresql.org/docs/7.4/static/plpgsql-control-structures.html
sorry, same goes for 8.2
http://www.postgresql.org/docs/8.2/static/plpgsql-control-structures.html