JDB/Aggregation/Prototype
< JDB | Aggregation
Jump to navigation
Jump to search
JDB: Layout Columns | Queries Σ | API: Structural Records | Client/Server | Implementation | Examples
Here's an Aggregation prototype based on notes above used in pre-release stages of JDB. By JDB release, aggregation is integrated into query syntax. This approach still can be used as a template for extending further features, such as expressions and functions.
First load jdb using a few utilities from strings.
require 'data/jdb strings' coinsert 'jdb'
We separate Select Clause parser as nested splitting over by, comma and alias.
NB. parse 'alias:func col,... by alias:col,...' NB. return lists: ra rf rc ba bc sel_parse=: 3 : 0 t=. 2{.' by ' splitstring y c=. (<',') splitstring^:(0<#@]) each t 'r b'=. (<':') (_2 {. splitstring)^:(0<#@]) every L:1 c 'ra re'=. <"1 |:r 'rf rc'=. <"1 |:(<' ') (_2 {. splitstring)^:(0<#@]) every re 'ba bc'=. <"1 |:b ra ; rf ; rc ; ba ,&< bc ) Note 'Test sel_parse' hd=. (;:'ra rf rc ba bc') ,: ] hd sel_parse 'a,b,c' hd sel_parse 'a,b,c by qq,zz' hd sel_parse 'q:a,b,x:c' hd sel_parse 'a,q:b,c by qq,z:zz' hd sel_parse 'sum a,q:avg b,min c by qq,z:zz' )
Parsing test results gives the idea:
hd sel_parse 'a,q:b,c by qq,z:zz' +-----+----+-------+----+-------+ |ra |rf |rc |ba |bc | +-----+----+-------+----+-------+ |++-++|++++|+-+-+-+|++-+|+--+--+| |||q|||||||||a|b|c||||z|||qq|zz|| |++-++|++++|+-+-+-+|++-+|+--+--+| +-----+----+-------+----+-------+ hd sel_parse 'sum a,q:avg b,min c by qq,z:zz' +-----+-------------+-------+----+-------+ |ra |rf |rc |ba |bc | +-----+-------------+-------+----+-------+ |++-++|+---+---+---+|+-+-+-+|++-+|+--+--+| |||q||||sum|avg|min|||a|b|c||||z|||qq|zz|| |++-++|+---+---+---+|+-+-+-+|++-+|+--+--+| +-----+-------------+-------+----+-------+
Next we define aggregator operations and a helper function.
The Reada verb follows the algorithm in Aggregation notes.
'`sum avg min max count first last'=: (+/)`(+/%#)`(<./)`(>./)`#`{.`{: ovr=: [^:(0 < #@[) each NB. x if nonempty overrides y Reada=: 4 : 0 if. 1=#y=. boxopen y do. Reads__x y return. end. 'tbl wh sl'=. 3{.y NB. where and select clauses if. -.' by ' +./@E. sl do. Reads__x y return. end. 'ra rf rc ba bc'=. sel_parse sl NB. parse select clause q=. (','joinstring bc,rc),' from ',tbl,(0<#wh)#' where ',wh s=. {:Read__x q NB. fetch filtered base columns c=. i.~|:i.~every (#bc){.s NB. classi-map from by-cols ff=. rf,~(#bc)${.`{. NB. aggregs with {. for by-cols r=. '' for_i. i.#bc,rc do. r=. r,< c ff@.i/. i {::s NB. apply aggregs over c end. ((ba ovr bc),ra ovr rc),:tocolumn r NB. columnar format ) Note 'sandp' buildsandp_jdb_ F=. jpath '~temp' d=. opensandp_jdb_ F 'j p s sp spj'=: GetTable__d 'j p s sp spj' Reads__d 'from j' Reads__d 'from p' Reads__d'from s' Reads__d '** from sp where s.city=p.city' Reads__d '** from spj where color=Blue' d Reada 'j';'';'job_cnt:count jname by city' d Reada 'p';'city<>Paris';'sum weight by color' d Reada 'p';'';'sum weight by color,city' d Reada 'p';'';'avg weight by city,color' d Reada 'p';'';'avg weight,pone:first pid by where:city,color' d Reada 's';'';'min_st:min status by city' d Reada 'sp';'';'sum weight,min status by supplier:s.city' d Reada 'sp';'';'sum weight by supp:s.city,part:p.city' d Reada 'spj';'';'avg_qty:avg qty by supp:sname,part:p.city' d Reada 'spj';'';'avg_qty:avg qty by part_col:p.color' resetall_jdb_'' )
Example output:
j Reada '*';'job_cnt:count jname by city' +------+-------+ |city |job_cnt| +------+-------+ |Paris |1 | |Rome |1 | |Athens|2 | |London|2 | |Oslo |1 | +------+-------+ p Reada 'city<>Paris';'sum weight by color' +-----+------+ |color|weight| +-----+------+ |Red |45 | |Blue |17 | +-----+------+ p Reada '*';'sum weight by color,city' +-----+------+------+ |color|city |weight| +-----+------+------+ |Red |London|45 | |Green|Paris |17 | |Blue |Oslo |17 | |Blue |Paris |12 | +-----+------+------+ p Reada '*';'avg weight by city,color' +------+-----+------+ |city |color|weight| +------+-----+------+ |London|Red |15 | |Paris |Green|17 | |Oslo |Blue |17 | |Paris |Blue |12 | +------+-----+------+ p Reada '*';'avg weight,pone:first pid by where:city,color' +------+-----+------+----+ |where |color|weight|pone| +------+-----+------+----+ |London|Red |15 |P1 | |Paris |Green|17 |P2 | |Oslo |Blue |17 |P3 | |Paris |Blue |12 |P5 | +------+-----+------+----+ s Reada '*';'min_st:min status by city' +------+------+ |city |min_st| +------+------+ |London|20 | |Paris |10 | |Athens|30 | +------+------+ sp Reada '*';'sum weight,min status by supplier:s.city' +--------+------+------+ |supplier|weight|status| +--------+------+------+ |London |134 |20 | |Paris | 46 |10 | +--------+------+------+ sp Reada '*';'sum weight by supp:s.city,part:p.city' +------+------+------+ |supp |part |weight| +------+------+------+ |London|London|59 | |London|Paris |58 | |London|Oslo |17 | |Paris |London|12 | |Paris |Paris |34 | +------+------+------+ spj Reada '*';'avg_qty:avg qty by supp:sname,part:p.city' +-----+------+-------+ |supp |part |avg_qty| +-----+------+-------+ |Smith|London| 450| |Jones|Oslo |442.857| |Jones|Paris | 100| |Blake|Oslo | 200| |Blake|London| 500| |Clark|London| 300| |Adams|Paris | 260| |Adams|London| 400| |Adams|Oslo | 200| +-----+------+-------+ spj Reada '*';'avg_qty:avg qty by part_col:p.color' +--------+-------+ |part_col|avg_qty| +--------+-------+ |Red | 400| |Blue |353.846| |Green | 150| +--------+-------+