Anyone know of a better way to calculate the derivative of two series in SQL? I.e., given a table with incrementing counters does a better method of finding the rate of change in those counters exist? The next best solution I can think of has me using postgresql's scripting language. Example table and my current solution:

Table: `r` id | rx_bytes | tx_bytes | stamp ----+------------+------------+---------------------------- 1 | 4409175828 | 1953740054 | 2007-05-10 13:52:05.768994 2 | 4409317532 | 1953876624 | 2007-05-10 13:55:04.676586 3 | 4409586227 | 1954221384 | 2007-05-10 14:00:01.921733 4 | 4411577594 | 1954689574 | 2007-05-10 14:05:05.606004 5 | 4412308594 | 1955036986 | 2007-05-10 14:10:02.103433 6 | 4412563005 | 1955363345 | 2007-05-10 14:15:02.930761 7 | 4412821894 | 1955604425 | 2007-05-10 14:20:09.826853 8 | 4413091666 | 1955914285 | 2007-05-10 14:25:06.545572 9 | 4413338941 | 1956166222 | 2007-05-10 14:30:04.06891 10 | 4413602704 | 1956457986 | 2007-05-10 14:35:05.334229

select ( r2.rx_bytes - r1.rx_bytes ) / ( extract(epoch from r2.stamp) - extract(epoch from r1.stamp) ) from r as r1, r as r2 where r2.id = r1.id + 1;

Implemented in plpgsql, the refcursor would mean using FETCH syntax, so I chose to always return junk (0.0 or something else) for the first row. Also, I need to benchmark this on a large-ish table:

-- create language plpgsql; create or replace function first_derivative( _a bigint, _b bigint ) returns numeric as $ph$ declare srow record; old_a bigint; old_b bigint; ret numeric; BEGIN -- setup temp table if we don't have it available -- XXX refactor temp table name to variable if not exists ( select 1 from pg_class where relname = 'first_derivative_temptable' ) then create temporary table first_derivative_temptable (a bigint, b bigint); insert into first_derivative_temptable values ( _a , _b ); return 0.0; end if; -- get old values for srow in select * from first_derivative_temptable LOOP old_a = srow.a; old_b = srow.b; end loop; -- save derivative ret = ( _a - old_a ) / ( _b - old_b ); -- insert values into temp table update first_derivative_temptable set a = _a , b = _b; return ret; END; $ph$ language plpgsql; -- tests select first_derivative( rx_bytes, cast( extract(epoch from stamp) as bigint) ) from raq1 where rx_bytes is not null order by stamp limit 5; select * from first_derivative_temptable;


What do you dislike about your current solution? -- David W
Inner joins suck when tables get big. ;)
Is there a reason the values can't be computed when you do the inserts? -- David W
Well, I want the table to reflect the data captured as much as possible to ensure as much future usability as possible. Regardless, I like to push things in terms of configurability/usability. -- Patrick.
In that case I would write a stored procedure. -- David W
I think plpgsql can only return a refcursor to handle this, and that's what I'm currently working on.
Interesting approach. I would have probably implemented it as a procedure returning a recordset. I'm not sure which is better. -- David W
Doh, obviously my approach wouldn't work. The recordset would be too huge. -- David W
Note: The current implementation of RETURN NEXT for PL/pgSQL stores the entire result set before returning from the function, as discussed above. That means that if a PL/pgSQL function produces a very large result set, performance may be poor: data will be written to disk to avoid memory exhaustion, but the function itself will not return until the entire result set has been generated. -- from http://www.postgresql.org/docs/7.4/static/plpgsql-control-structures.html
sorry, same goes for 8.2 http://www.postgresql.org/docs/8.2/static/plpgsql-control-structures.html