NYCJUG/2019-05-14
Beginner's Regatta
Five Programming Problems Every Software Engineer Should be Able to Solve in Less than 1 Hour
[From this essay: https://www.shiftedup.com/2015/05/07/five-programming-problems-every-software-engineer-should-be-able-to-solve-in-less-than-1-hour. These links seem to be broken but here's one with solutions to these problems in Python: https://grison.me/2015/05/09/five-programming-problems-every-software-engineer-should-be-able-to-solve-in-less-than-1-hour/.]
Whenever I post a job request for a Software Engineer position, applications start trickling in really quickly. What bothers me is that several applicants will invariably have no idea of what "programming" means.
Of course, they think otherwise.
I guess it's fine to apply for a "Front-End Web Developer" position if the only thing you know is jQuery, but since when does "Software Engineer" mean that only HTML, JavaScript, and CSS skills are required?
(And I love those who can't shut up about XML, JSON, XSLT, SOAP, HTTP, REST, SSL, and 200 more acronyms, but can't differentiate between the int and float data types.)
Can You Actually Code Anything?
For my Software Engineer position I'm usually expecting you to be able to code something. I'm talking about real code here: I give you a problem, and you write a solution for it using any programming language you feel comfortable with.
Do you think you can actually do this?
Here is the deal: if you can't solve the following 5 problems in less than 1 hour, you may want to revisit your resume. You might be great at doing whatever you do today, but you need to stop calling yourself a "Software Engineer" (or Programmer, or Computer Science specialist, or even maybe "Developer".) Stop lying to yourself, and take some time to re-focus your priorities.
The 5 problems
(The following problems are ridiculously simple, but you'd be surprised to discover how many people struggle with them. To the point of not getting anything done at all. Seriously.)
[* Spoiler Alert * Except for the first one, which is left as an exercise for the aspiring Jedi, J solutions follow each problem statement.]
Problem 1
Write three functions that compute the sum of the numbers in a given list using a for-loop, a while-loop, and recursion.
No - those are bad ways to do this. [You don't say? Maybe Gaussian shortcut using APVs?]
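[For contrast, here is a minimal sketch of the idiomatic J approach, which needs neither a loop nor recursion; the three loop-based versions the problem asks for are still left to the aspiring Jedi.]

   +/ 3 1 4 1 5 9   NB. Insert the verb + between the items of the list
23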
Problem 2
Write a function that combines two lists by alternatingly taking elements. For example: given the two lists [a, b, c] and [1, 2, 3], the function should return [a, 1, b, 2, c, 3].
   ,'abc',.'123'
a1b2c3
Problem 3
Write a function that computes the list of the first 100 Fibonacci numbers. By definition, the first two numbers in the Fibonacci sequence are 0 and 1, and each subsequent number is the sum of the previous two. As an example, here are the first 10 Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, 21, and 34.
   (] , [: +/ _2 {. ])^:100 ] 1x   NB. Use extended precision.
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465 14930352 24157817 39088169 63245986 102334155 165580141 267914296 433494437 701408733 1134903170 1836311903 2971215073 4807526976 7778742049 12586269025 20365011074 32951280099 53316291173 86267571272 139583862445 225851433717 365435296162 591286729879 956722026041 1548008755920 2504730781961 4052739537881 6557470319842 10610209857723 17167680177565 27777890035288 44945570212853 72723460248141 117669030460994 190392490709135 308061521170129 498454011879264 806515533049393 1304969544928657 2111485077978050 3416454622906707 5527939700884757 8944394323791464 14472334024676221 23416728348467685 37889062373143906 61305790721611591 99194853094755497 160500643816367088 259695496911122585 420196140727489673 679891637638612258 1100087778366101931 1779979416004714189 2880067194370816120 4660046610375530309 7540113804746346429 12200160415121876738 19740274219868223167 31940434634990099905 51680708854858323072 83621143489848422977 135301852344706746049 218922995834555169026 354224848179261915075 573147844013817084101
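[The problem as stated starts the sequence at 0 and 1 and asks for exactly 100 numbers; a hedged variant of the same idea, shown here only for its first ten terms, does that:]

   10 {. (] , [: +/ _2 {. ])^:98 ] 0 1x   NB. 98 appends to the 2 seeds give 100 numbers
0 1 1 2 3 5 8 13 21 34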
Problem 4
Write a function that, given a list of non-negative integers, arranges them such that they form the largest possible number. For example, given [50, 2, 1, 9], the largest formed number is 95021.
   ;\:~":&.>50 2 1 9
95021
[This sorts the character representations in descending order so the highest digits are the first ones, based on the idea that the largest number has the highest digits in the most significant place. Some other examples follow.]
   ;\:~":&.>12 33 14 66 77
7766331412
   ]rnd=. 10?100
63 87 75 37 74 79 54 39 25 60
   ;\:~":&.>rnd
87797574636054393725
[Since we do not convert back to numeric, this works for quite large values.]
   ]rnd=. 50?1000
622 324 907 63 386 338 95 601 285 926 299 417 687 837 792 465 605 190 732 199 654 925 409 619 891 888 716 996 477 946 818 624 142 211 221 293 691 839 728 559 482 875 429 217 729 972 132 479 169 495
   ;\:~":&.>rnd
9969729594692692590789188887583983781879273272972871669168765463624622619605601559495482479477465429417409386338324299293285221217211199190169142132
[However, this simple sort gets the wrong answer for an input like 52 5 3: a plain descending sort of the digit strings puts '52' before '5', giving 5253, whereas the largest arrangement is 5523; see https://www.reddit.com/r/programming/comments/358tnp/five_programming_problems_every_software_engineer/.]
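[A sketch of one possible fix follows; the verb name "largestNum" is our own, not from the original post. Since the correct ordering compares the concatenations a,b against b,a, we can instead sort on each number's digits cyclically extended to a common length, which induces the same ordering.]

largestNum=: 3 : 0
 s=. ":&.>y                NB. Decimal-string representation of each number
 keys=. (+/#&>s) $&.> s    NB. Cyclically extend each string to the total digit count
 ;s \: >keys               NB. Raze the strings sorted descending by those keys
)
   largestNum 52 5 3
5523
   largestNum 50 2 1 9
95021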
Update: Apparently this problem got a lot of people talking (although not as much as Problem 5 below.) You can click here to read my solution.
Problem 5
Write a program that outputs all possibilities to put + or - or nothing between the numbers 1, 2, ..., 9 (in this order) such that the result is always 100. For example: 1 + 2 + 34 – 5 + 67 – 8 + 9 = 100.
   $allcombos=. ,{8$<'+- '
6561
NB. 6561=3^8 because we have 8 groups of 3 symbols; 8=<:9 digits between which we insert verbs.
   >0{allcombos
++++++++
   ]nn=. ;":&.>}.i.10
123456789
   ,nn,.9{.>0{allcombos
1+2+3+4+5+6+7+8+9
   $all=. ' '-.~&.>,&.>(<nn),.&.>9{.&.>allcombos   NB. Remove spaces
6561
   3{.all
+-----------------+-----------------+----------------+
|1+2+3+4+5+6+7+8+9|1+2+3+4+5+6+7+8-9|1+2+3+4+5+6+7+89|
+-----------------+-----------------+----------------+
   ".&>all   NB. Evaluate each formula
45 27 117 11 29 _61 108 90 810 _3 15 _75 31 13 103 _66 _48 _768 99 81 171 65 83 _7 702 684 6804 _15 3 _87 19 1 91 _78 _60 _780 33 15 105 _1 17 _73 96 78 798 _69 _51 _141 _35 _53 37 _672 _654 _6774 90 72 162 56 74 _16 153 135 855 42 60 _30 76 58 148 _21 _3 ...
   #~.".&>all
2339        NB. This many distinct answers
   100+/ . =".&>all   NB. # that = 100
11
   >all#~100=".&>all   NB. What they are.
1+2+3-4-5+6+78+9
1+2+34-5-67-8-9
1+23-4-5+6+78-9
1+23-4-56+7+8+9
12+3+4+5-6+7-89
12+3-4-5+67+8+9
12-3+4-5-6-7+89
123+4-5-67-89
123+45-67-8-9
123-4+5+6+7-8-9
123-45+67-89
Update: (Here is one solution to this problem in case you are curious: https://www.shiftedup.com/2015/05/08/solution-to-problem-5-and-some-other-thoughts-about-this-type-of-questions.)
import java.util.ArrayList;

public class Main {

    private static int TARGET_SUM = 100;
    private static int[] VALUES = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    static ArrayList<String> add(int digit, String sign, ArrayList<String> branches) {
        for (int i = 0; i < branches.size(); i++) {
            branches.set(i, digit + sign + branches.get(i));
        }
        return branches;
    }

    static ArrayList<String> f(int sum, int number, int index) {
        int digit = Math.abs(number % 10);
        if (index >= VALUES.length) {
            if (sum == number) {
                ArrayList<String> result = new ArrayList<String>();
                result.add(Integer.toString(digit));
                return result;
            } else {
                return new ArrayList<String>();
            }
        }

        ArrayList<String> branch1 = f(sum - number, VALUES[index], index + 1);
        ArrayList<String> branch2 = f(sum - number, -VALUES[index], index + 1);

        int concatenatedNumber = number >= 0
            ? 10 * number + VALUES[index]
            : 10 * number - VALUES[index];
        ArrayList<String> branch3 = f(sum, concatenatedNumber, index + 1);

        ArrayList<String> results = new ArrayList<String>();
        results.addAll(add(digit, "+", branch1));
        results.addAll(add(digit, "-", branch2));
        results.addAll(add(digit, "", branch3));

        return results;
    }

    public static void main(String[] args) {
        for (String string : f(TARGET_SUM, VALUES[0], 1)) {
            System.out.println(string);
        }
    }
}
Show and Tell
Convolutional Neural Nets
A member of the J forum, Jon Hough, posted some code for implementing a two-dimensional convolutional neural net.
From: jonghough via Programming <programming@jsoftware.com>
Date: Thu, Apr 18, 2019 at 12:44 AM
Subject: Re: [Jprogramming] convolutional neural network [was simplifying im2col]
To: <programming@jsoftware.com>

You can test my convnet on the cifar 10 dataset (need to download it to your PC first, and get all values into memory by: ...
Main Code Flow for Hough's Convolutional Neural Net Code
I've slightly modified the example provided by Jon Hough to be more succinct, so my startup is like this:
   1!:44 '\amisc\work\neuralnets\conv2D\'   NB. Move to project area
NB. (setq default-directory "\\amisc\\work\\neuralnets\\conv2D\\")   NB. for emacs
   load 'E:\Users\DevonMcC\j64-807-user\projects\jlearn\init.ijs'
   load 'initData.ijs'
The "init" loaded from the "projects" area is purely the code from Hough. It loads a number of things not needed for this project, so, at Jon's suggestion, I've commented out the sections of that module that are irrelevant to CNNs (convolutional neural nets). It loads these modules, which mostly correspond to the different layers of a CNN:
require jpath '~Projects/jlearn/adv/advnn.ijs'
require jpath '~Projects/jlearn/adv/activation.ijs'
require jpath '~Projects/jlearn/adv/lstm.ijs'
require jpath '~Projects/jlearn/adv/basicrnn.ijs'
require jpath '~Projects/jlearn/adv/poollayer.ijs'
require jpath '~Projects/jlearn/adv/flattenlayer.ijs'
require jpath '~Projects/jlearn/adv/conv2d.ijs'
require jpath '~Projects/jlearn/adv/batchnorm.ijs'
require jpath '~Projects/jlearn/adv/dropout.ijs'
There is also a "jserialize.ijs" module for writing the model to file, which I do not use because I have adapted my own "WS" utility that writes J nouns to file. The "initData.ijs" script simplifies loading the CIFAR10 data used as an example for this implementation:
NB.* initData.ijs: initialize CIFAR10 training data.
load BASEDSK,'/amisc/work/neuralNets/conv2D/CIFAR10.ijs'
CIFAR10Path=. BASEDSK,'/amisc/work/neuralNets/cifar-10-batches-bin/'
'testLbls testData'=: CIFAR10Path readCIFAR10 'test_batch.bin'   NB. Test labels and data
testLbls=: #: 2^ testLbls          NB. Label nums to 10-element Boolean vecs
testData=: 255%~testData           NB. Map 0-255 -> 0-1.
training=. (<CIFAR10Path) readCIFAR10 &.> (<'.bin'),~&.>(<'data_batch_'),&.>":&.>>:i.5
$&.>'trainLbls trainData'=: ;&.> <"1 |:>training   NB. Assign training labels and data.
trainLbls=: ,/#:2^|: ,: trainLbls  NB. Label nums to 10-element Boolean vecs
trainData=: 255%~trainData         NB. Map 0-255 -> 0-1.
4!:55 <'training'
This replaces and generalizes 39 lines in the original example.
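To see what the label conversion in "initData.ijs" does, here is a small check (not from the original notes): raising 2 to each label and taking the antibase-2 representation turns each label number into a 10-element Boolean vector with a single 1.

   #: 2^ 0 1 2 9   NB. One Boolean (one-hot) row per label
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 1 0 0
1 0 0 0 0 0 0 0 0 0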
Once the data is loaded, we bring in a local copy of the base CNN code, then initialize the example model:
   load 'conv2D.ijs'
   load 'setupModel0.ijs'
Added Conv2D. Network depth is 1.
Added Activation. Network depth is 2.
Added PoolLayer. Network depth is 3.
Added Conv2D. Network depth is 4.
Added Activation. Network depth is 5.
Added Conv2D. Network depth is 6.
Added Activation. Network depth is 7.
Added Conv2D. Network depth is 8.
Added Activation. Network depth is 9.
Added PoolLayer. Network depth is 10.
Added FlattenLayer. Network depth is 11.
Added SimpleLayer. Network depth is 12.
Added Activation. Network depth is 13.
Added SimpleLayer. Network depth is 14.
Added Activation. Network depth is 15.
Added SimpleLayer. Network depth is 16.
Added Activation. Network depth is 17.
Added Dropout. Network depth is 18.
Added SimpleLayer. Network depth is 19.
Added Activation. Network depth is 20.
This creates a number of namespaces, the numbered ones below, reflecting both J's automatic numbering of anonymous namespaces and the object-oriented origin of this code:
0 1 10 11 12 13 14 15 16 17 18 19 2 20 21 22 23 24 25 26 27 28 29 3 30 31 32 4 5 6 7 8 9 Activation AdaGradSolver AdamSolver BaseMLP BasicRNNLayer BatchNorm BatchNorm2D ... jzlib pcsv pdsv pplatimg testcases z
This setup from "setupModel0.ijs" looks like this:
NB.* setupModel0.ijs: set up example model from Jon Hough.
pipe=: (100;25;'softmax';1;'l2';1e_3) conew 'NNPipeline'
LR=: 0.001
c1=: ((40 3 3 3);1;'relu';'adam';LR;1) conew 'Conv2D'
a1=: 'relu' conew 'Activation'
p1=: 2 conew 'PoolLayer'
c2=: ((50 40 3 3);2;'relu';'adam';LR;1) conew 'Conv2D'
a2=: 'relu' conew 'Activation'
c3=: ((70 50 4 4);1;'relu';'adam';LR;1) conew 'Conv2D'
a3=: 'relu' conew 'Activation'
c4=: ((80 70 3 3);1;'relu';'adam';LR;1) conew 'Conv2D'
a4=: 'relu' conew 'Activation'
p2=: 2 conew 'PoolLayer'
fl=: 1 conew 'FlattenLayer'
fc1=: (80;90;'relu';'adam';LR) conew 'SimpleLayer'
a5=: 'relu' conew 'Activation'
d1=: 0.8 conew 'Dropout'
fc2=: (90;100;'relu';'adam';LR) conew 'SimpleLayer'
a6=: 'relu' conew 'Activation'
d2=: 0.8 conew 'Dropout'
fc3=: (100;60;'relu';'adam';LR) conew 'SimpleLayer'
a7=: 'relu' conew 'Activation'
d3=: 0.8 conew 'Dropout'
fc4=: (60;10;'softmax';'adam';LR) conew 'SimpleLayer'
a8=: 'softmax' conew 'Activation'

addLayer__pipe c1
addLayer__pipe a1
addLayer__pipe p1
addLayer__pipe c2
addLayer__pipe a2
addLayer__pipe c3
addLayer__pipe a3
addLayer__pipe c4
addLayer__pipe a4
addLayer__pipe p2
addLayer__pipe fl
addLayer__pipe fc1
addLayer__pipe a5
addLayer__pipe fc2
addLayer__pipe a6
addLayer__pipe fc3
addLayer__pipe a7
addLayer__pipe d3
addLayer__pipe fc4
addLayer__pipe a8

checkPredictions=: 4 : '+/ (y{>0{x) -:"1 1 (=>./)"1 >{:predict__pipe y{>1{x'
NB.EG rr=. (<testLbls;<testData) checkPredictions &> (<i.100)+&.>100*i.<.100%~#testLbls   NB. Check all tests in batches of 100.
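As a toy illustration (with made-up numbers, and ignoring the call to predict__pipe) of the scoring idea inside checkPredictions: mark the maximum of each prediction row, then count the rows whose marked position matches the one-hot label.

   preds=. 3 3 $ 0.1 0.7 0.2  0.5 0.3 0.2  0.2 0.2 0.6
   lbls=. 3 3 $ 0 1 0  0 0 1  0 0 1
   (=>./)"1 preds        NB. 1 at the largest value in each row
0 1 0
1 0 0
0 0 1
   +/ lbls -:"1 1 (=>./)"1 preds   NB. Count of correct predictions
2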
I keep track of the layers used in the model because I want to work across the namespaces:
COVALS__pipe=: pipe,layers__pipe
CONMS__pipe=: 'pipe';'c1';'a1';'p1';'c2';'a2';'c3';'a3';'c4';'a4';'p2';'fl';'fc1';'a5';'fc2';'a6';'fc3';'a7';'d3';'fc4';'a8'
   COVALS__pipe,:CONMS__pipe
+----+--+--+--+--+--+--+--+--+--+--+--+---+--+---+--+---+--+--+---+--+
|0   |1 |3 |4 |5 |7 |8 |10|11|13|14|15|16 |18|20 |22|24 |26|27|28 |30|
+----+--+--+--+--+--+--+--+--+--+--+--+---+--+---+--+---+--+--+---+--+
|pipe|c1|a1|p1|c2|a2|c3|a3|c4|a4|p2|fl|fc1|a5|fc2|a6|fc3|a7|d3|fc4|a8|
+----+--+--+--+--+--+--+--+--+--+--+--+---+--+---+--+---+--+--+---+--+
We are ready to start training the model, but first let's check the predictive power of the untrained model:
   rr=. (<testLbls;<testData) checkPredictions &> (<i.100)+&.>100*i.<.100%~#testLbls
   load 'mystats'
   usus rr   NB. Min, max, mean, standard deviation of predictions
4 20 10.7 3.08303
This tells us that, for the testing on groups of 100 items from the test set, the number of correct predictions ranged from 4 to 20 with a mean of 10.7 - basically the 10% correct we would expect for random classification of the photos into 10 categories. The length of training is controlled by a global "epochs" in the "pipe" namespace, so we may want to set this to a number lower than the default of 25 in order to complete a round of testing more quickly. The global "bs" in the same namespace controls how large a block is trained at once, so it affects the memory footprint of the training process. Some of the modifications I have made to the top-level verb are for better tracking of timing while a training session is running. However, let us first look at the original code for this:
NB. Fits the training datapoints to the training labels.
NB. The algorithm will run for the given number of epochs (defined in
NB. NNPipeline constructor) or until the results are sufficiently
NB. accurate.
NB. parameters:
NB. y: training set.
NB. x: training labels
NB. note that x and y must have the same length (row count).
fit=: 4 : 0
  NB. test output
  l=: {: layers
  if. type__l -: 'SimpleLayer' do.
    'Pipeline output must match last layer''s activation.' assert (tolower output)-: activation__l
  end.
  sz=. # y
  pe=: >. sz % bs
  ctr=: 0
  for_j. i.#layers do.      NB. Loop through the namespaces, running each
    l=: j{layers            NB. "onBeginFit", which may differ between them.
    onBeginFit__l ''
  end.
  NB.preRun y{~ bs ?#y
  while. ctr <epochs do.
    ectr=: 0
    while. ectr < pe do.
      ectr=: ectr+1
      for_j. i.#layers do.
        l=: j{layers
        onBeginFit__l ''
      end.
      index=. bs ?# y       NB. choose random row(s)
      X=. index { y         NB. get the random sample(s)
      Y=. index { x         NB. corresponding target value(s)
      Y fit1 X
      total=: total + 1
      smoutput 'Iteration complete: ',(":ectr),', total: ',":total
      wd^:1 'msgs'
    end.
    ctr=: ctr+1
  end.
)
Here is how I updated this code to simplify it and add runtime timing output. There are also a few commented-out lines where I experimented with different ways of working through the training cases:
fit_0_=: 4 : 0
  l=. {: layers
  if. type__l -: 'SimpleLayer' do.
    'Pipeline output must match last layer''s activation.' assert (tolower output)-: activation__l
  end.
  pe=. 1   NB. >. (#y) % bs
  ctr=. 0
  ".&.>(<'_ '''''),~&.>(<'onBeginFit_'),&.>layers
  NB.preRun y{~ bs ?#y
  elapsedTime=. 3$6!:1''    NB. Start, loop start, current # seconds
  while. ctr < epochs do.
    ectr=. 0
NB.    while. ectr < pe do.
    for_ixs. (-bs)]\?~#y do.             NB. Random, complete list
      elapsedTime=. (6!:1'') 1}elapsedTime
      ".&.>(<'_ '''''),~&.>(<'onBeginFit_'),&.>layers
NB.      ixs=. (bs*ectr)+i.bs            NB. Next block sequentially
NB.      ixs=. bs?#y                     NB. Next block randomly
      (ixs{x) fit1 ixs{y
NB.      backwardPass (ixs{x) calcError >{: forwardPass ixs{y
      total=. >:total [ ectr=. >:ectr
      elapsedTime=. (6!:1'') 2}elapsedTime
      msg=. 'Iterations complete: ',(":ectr),', total: ',":total
      smoutput msg,;' time (total and per loop): ',":({:-}:) elapsedTime
    end.
    ctr=. >:ctr
  end.
)
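As a small aside, with a made-up permutation standing in for the random one, the batching expression (-bs)]\?~#y in the loop header shuffles all the row indices and cuts them into non-overlapping groups of bs (here the length divides evenly):

   perm=. 2 7 0 4 1 5 8 3 6   NB. Stand-in for ?~9, a random permutation of i.9
   (-3) ]\ perm               NB. Non-overlapping batches of 3 indices each
2 7 0
4 1 5
8 3 6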
The major subroutine here is "fit1":
NB. ---
fit1__pipe=: 4 : 0
  o=. forwardPass y
  e=. x calcError >{: o
  backwardPass e
)
The major subroutines here are "forwardPass", "calcError", and "backwardPass":
NB. ---
forwardPass__pipe=: 3 : 0
  in=. y
  outs=. ''
  for_j. i.#layers do.
    l=. j{layers
    out=. forward__l in
    outs=. outs,<out
    in=. out
  end.
  outs
)
NB. ---
calcError__pipe=: 4 : 0
  sw=. ''
  for_i. i.#layers do.
    l=. i{layers
    sw=. sw, getWeights__l ''
  end.
  r=: bs %~ regCoefficient * regF sw
  e=. x errorf y;r;bs
  lastLoss=: loss
  bestLoss=: bestLoss updateError lastLoss
  lossKeep=: lossKeep,loss
  y-x
)
NB. ---
backwardPass__pipe=: 3 : 0
  delta=. y                   NB. output error
  for_j. |.i.#layers do.
    l=. j{layers
    delta=. backward__l delta NB. See "Different "backward" Verbs from Hough's CNN" in "backward versions.docx"
  end.
  ''
)
The routines referenced in some of the above:
NB. --------------------
   regCoefficient__pipe
0.001
   regF__pipe
0.001&*@:regL2
   regL2__pipe
-:@:(+/)@:*:
   errorf__pipe
outSoftmax
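A quick check (not in the original notes) of what these definitions compute: regL2 is half the sum of squares of the weights, and regF scales that by the 0.001 coefficient.

   regL2__pipe 1 2 3   NB. -: +/ *: 1 2 3
7
   regF__pipe 1 2 3    NB. 0.001 * 7
0.007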
At this point we are pretty deep into the weeds, so it may make sense to take a step back to look at all the different verbs with the same names in the different namespaces:
Layer       Type
pipe    0   Container
c1      1   Conv2D
a1      3   Activation
p1      4   PoolLayer
c2      5   Conv2D
a2      7   Activation
c3      8   Conv2D
a3     10   Activation
c4     11   Conv2D
a4     13   Activation
p2     14   PoolLayer
fl     15   FlattenLayer
fc1    16   SimpleLayer
a5     18   Activation
fc2    20   SimpleLayer
a6     22   Activation
fc3    24   SimpleLayer
a7     26   Activation
d3     27   Dropout
fc4    28   SimpleLayer
a8     30   Activation

onBeginFit_19_ : 3 : ''''''
onBeginFit_21_ : 3 : (,'0')
onBeginFit_22_ : 3 : (,'0')
onBeginFit_23_ : 3 : ''''''
onBeginFit_25_ : 3 : (,'0')
onBeginFit_26_ : 3 : ''''''
onBeginFit_28_ : 3 : (,'0')
onBeginFit_29_ : 3 : ''''''
onBeginFit_31_ : 3 : (,'0')
onBeginFit_32_ : 3 : (,'0')
onBeginFit_33_ : 3 : (,'0')
onBeginFit_34_ : 3 : ''''''
onBeginFit_36_ : 3 : (,'0')
onBeginFit_38_ : 3 : ''''''
onBeginFit_40_ : 3 : (,'0')
onBeginFit_42_ : 3 : ''''''
onBeginFit_44_ : 3 : (,'0')
onBeginFit_45_ : 3 : ' forward=: 3 : ''forwardPassFit y'''
onBeginFit_46_ : 3 : ''''''
onBeginFit_48_ : 3 : (,'0')

Weights
getWeights_19_ : 3 : ',>filter'
getWeights_21_ : 3 : (,'0')
getWeights_22_ : 3 : (,'0')
getWeights_23_ : 3 : ',>filter'
getWeights_25_ : 3 : (,'0')
getWeights_26_ : 3 : ',>filter'
getWeights_28_ : 3 : (,'0')
getWeights_29_ : 3 : ',>filter'
getWeights_31_ : 3 : (,'0')
getWeights_32_ : 3 : (,'0')
getWeights_33_ : 3 : (,'0')
getWeights_34_ : 3 : ',w'
getWeights_36_ : 3 : (,'0')
getWeights_38_ : 3 : ',w'
getWeights_40_ : 3 : (,'0')
getWeights_42_ : 3 : ',w'
getWeights_44_ : 3 : (,'0')
getWeights_45_ : 3 : (,'0')
getWeights_46_ : 3 : ',w'
getWeights_48_ : 3 : (,'0')

   $filter_19_
40 3 3 3
   $filter_23_
50 40 3 3
   $filter_29_
80 70 3 3
   $w_34_
81 90
   $w_38_
91 100
   $w_42_
101 60
   viewmat w_34_
Training the Model
We start the training like this:
   6!:2 'trainLbls fit__pipe trainData'
Iterations complete: 1, total: 1 time (total and per loop): 11.997 11.993
Iterations complete: 2, total: 2 time (total and per loop): 24.054 12.057
Iterations complete: 3, total: 3 time (total and per loop): 35.869 11.815
Iterations complete: 4, total: 4 time (total and per loop): 48.003 12.134
Iterations complete: 5, total: 5 time (total and per loop): 60.393 12.39
...
Iterations complete: 499, total: 499 time (total and per loop): 6042.65 11.95
Iterations complete: 500, total: 500 time (total and per loop): 6054.61 11.955
6053.97
   0 60 60#:6053.97   NB. This takes about an hour and 40 minutes
1 40 53.97
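Given that a full run takes about an hour and 40 minutes, one could shorten an experiment by lowering the two training globals described earlier before calling fit; the values below are hypothetical, not the ones used in the run above.

   epochs__pipe=: 5   NB. Hypothetical value; the default is 25
   bs__pipe=: 100     NB. Hypothetical batch size; smaller values reduce the memory footprint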
Checking how well this much training does:
   6!:2 'rr__pipe=. (<testLbls;<testData) checkPredictions &> (<i.100)+&.>100*i.<.100%~#testLbls'
291.089
   load 'mystats'
   usus rr__pipe
12 34 22.44 4.45249
So we see that predictions are now right about 22% of the time on average.
Different "forward" Verbs in Hough's CNN
Here are the verbs for the forward pass through the layers.
NB. Forward pass through the layer. This function takes the
NB. input tensor and convolves it with the filter, to give an output
NB. tensor of same dimensionality as the input.
NB. The output is passed to the activation function and returned.
NB. Parameters:
NB. 0: Input tensor, must be 4-dimensional, of shape:
NB.    batch-size x channel x width x height
NB. Returns:
NB. Output tensor of same dimensionality as input, but different
NB. shape, depending on the convolution algorithm.
forward_19_=: 3 : 0"_
  i=: y
  'Input data has incorrect shape. Must be 4-d.' assert 4=#$y
  r=: ks cf"_ 3 y
  n=: r
  NB. first forward pass. We need to build bias tensor.
  if. bias -: '' do.
    bias=: 2 %~ +: 0.5-~ ? ( }. $ n) $ 0
    solver=: (<filter) setSolver tolower solverType
    e__solver=: alpha
  end.
  r=: r +"3 _ bias
  r
)
NB. ---
forward_21_=: 3 : 0
  i=: y
  act i
)
NB. ---
forward_22_=: 3 : 0    NB. From "poollayer.ijs"
  i=: y
  t=: (poolSize&pool)"2 i
)
NB. 0: poolSize, the width (and height) of the pooling grids
NB. max pool
pool=: 4 : 0
  poolKernel=. 2 2 $ ,x
  pooled=: poolKernel kernelFunc ;.3 y
  pooled
)
kernelFunc=: ((>./)@:,)
NB. ---
NB. forward_23_=: 3 : 0"_   NB. From "conv2d.ijs"
NB. forward_25_=: 3 : 0     NB. From "activation.ijs"
NB. forward_26_=: 3 : 0"_   NB. From "conv2d.ijs"
NB. forward_28_=: 3 : 0     NB. From "activation.ijs"
NB. forward_29_=: 3 : 0"_   NB. From "conv2d.ijs"
NB. forward_31_=: 3 : 0     NB. From "activation.ijs"
NB. ---
NB. From "flattenlayer.ijs"
forward_33_=: 3 : 0
  shape=: $y
  ,/"2,/"3 y
)
NB. ---
forward_34_=: 3 : 0    NB. From "advnn.ijs, coclass 'SimpleLayer'"
  i=: y
  n=: (y,"1] 1) dot w
)
dot_34_=: +/ .*
NB. ---
NB. forward_36_=: 3 : 0   NB. From "activation.ijs"
NB. forward_38_=: 3 : 0   NB. From "advnn.ijs, coclass 'SimpleLayer'"
NB. forward_40_=: 3 : 0   NB. From "activation.ijs"
NB. forward_42_=: 3 : 0   NB. From "advnn.ijs, coclass 'SimpleLayer'"
NB. forward_44_=: 3 : 0   NB. From "activation.ijs"
NB. forward_45_=: 3 : 'forwardPassPredict y'   NB. From "dropout.ijs"
NB. forward_46_=: 3 : 0   NB. From "advnn.ijs, coclass 'SimpleLayer'"
NB. forward_48_=: 3 : 0   NB. From "activation.ijs"
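To make the pooling concrete, here is a small base-locale check (our own toy copy of the pool and kernelFunc definitions shown above, with the global assignment dropped) on a 4-by-4 input with 2-by-2 non-overlapping max pooling:

   kernelFunc=: ((>./)@:,)
   pool=: 4 : 0
 poolKernel=. 2 2 $ ,x
 poolKernel kernelFunc ;.3 y
)
   2 pool 4 4 $ i.16   NB. Maximum of each non-overlapping 2x2 tile
 5  7
13 15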
Different "backward" Verbs in Hough's CNN
Here are the verbs for the backward pass through the layers.
NB. (":&.>i.6),.'\'&(] }.~ [: >: i:~) &.> ~.whereDefined&.>'_',~&.>(<'backward_'),&.>layers__pipe
NB. The above, edited:
+-+--------------------------------+
|0|conv2d.ijs                      |
+-+--------------------------------+
|1|activation.ijs                  |
+-+--------------------------------+
|2|poollayer.ijs                   |
+-+--------------------------------+
|3|flattenlayer.ijs                |
+-+--------------------------------+
|4|advnn.ijs                       |
+-+--------------------------------+
|5|advnn.ijs, coclass 'SimpleLayer'|
+-+--------------------------------+
|6|dropout.ijs                     |
+-+--------------------------------+

localeLayerType=: 0 : 0
19 Conv2D
21 Activation
22 PoolLayer
23 Conv2D
25 Activation
26 Conv2D
28 Activation
29 Conv2D
31 Activation
32 PoolLayer
33 FlattenLayer
34 SimpleLayer
36 Activation
38 SimpleLayer
40 Activation
42 SimpleLayer
44 Activation
45 Dropout
46 SimpleLayer
48 Activation
)

NB. == 0: conv2d.ijs: ==
NB. Backward pass through the layer. This function takes the next
NB. ( next in the forward direction) layer's output delta, updates
NB. the filter (weight) values, bias values,
NB. calculates the next delta, which it then returns.
NB. Arguments: 0: delta value from next layer.
NB. Returns: the newly calculated delta value.
backward=: 3 : 0    NB. EG backward_19_
  ntd=: td=: y
  NB. first backprop the activation
  NB. Propagate the error backwards.
  NB. For each weight:
  NB.   dError/dWeight = Sum dError/dOut * dOut/dWeight
  NB. calculate each multiplicand separately.
  NB. (1) - calculate dError/dOut
  NB.
  NB. For the transpose convolution, we need the
  NB. padding, zero-insertion, stride, and kernel size
  NB. see: https://arxiv.org/pdf/1603.07285.pdf (p. 23)
  fp=. 0            NB. assume forward padding is zero
  fs=. 0{,ks        NB. forward stride
  fk=. 4{,ks        NB. forward kernel size
  zi=. fs - 1       NB. zero insertion
  bk=. fk           NB. kernel size is same as forward
  bz=. 1            NB. backwards kernel stride
  bp=. fk - fp + 1  NB. backwards padding
  kernel=: 2 3 $ bz, bz, bz, (1{$td), bk, bk
  NB. convolve prior tensor deltas with the forward filter
  NB. First, transform td to the appropriate shape, padding.
  ttd=: bp padAll"2 zi insertZeros td
  dOut=: kernel bcf"_ 3 ttd   NB. delta to be returned to previous layer.
  NB. (2) - calculate dOut/dWeight
  NB. We must expand the tensor of deltas by (stride-1) where stride is
  NB. the stride number for the associated forward convolution.
  stride=. 0{,ks    NB. first index is the stride.
  exdeltas=: (<:stride) insertZeros ntd   NB. expanded next layer deltas.
  NB. Now we can convolve the prior outputs with
  NB. the exdeltas, where kernel size is the
  NB. same as the forward convolution.
  dW=: |:"3|:"4 (+/ % #) (exdeltas) deconv"3 2 i
  dW=: (clampLow, clampHigh) clampV"1 0 dW
  NB.dBias=. ((1&,@:$)$ ,) (+/ % #) ntd
  filter=: >(<filter) calcGrad__solver <dW
  bias=: bias - alpha * (+/ % #) ntd
  NB. finally return delta
  dOut
)

NB. == 1: activation.ijs: ==
backward=: 3 : 0    NB.EG backward_21_
  delta=.y
  if. (isLast = 0 )*. setIsLast = 1 do.
    delta=. (dact i) * delta
  elseif. setIsLast = 0 do.
    setIsLast=:1
    if. next -: '' do.
      isLast=: 1
    elseif. type__next -: 'BatchNorm' do.
      isLast=: 1
    end.
    if. isLast = 0 do.
      delta=. (dact i) * delta
    end.
  end.
  delta
)
NB. ---
NB. backward_23_=: 3 : 0"_   NB. From "0: conv2d.ijs"
NB. backward_25_=3 : 0       NB. From "1: activation.ijs"
NB. backward_26_=3 : 0"_     NB. From "0: conv2d.ijs"
NB. backward_28_=3 : 0       NB. From "1: activation.ijs"
NB. backward_29_=: 3 : 0"_   NB. From "0: conv2d.ijs"
NB. backward_31_=: 3 : 0     NB. From "1: activation.ijs"
NB. == 2: poollayer.ijs: ==
backward=: 3 : 0    NB.EG backward_32_, from "2: poollayer.ijs"
  td=: y
  V=: i
  Z=: t
  U=: poolSize depoolExpand Z
  UV=: U=V
  dpe=: poolSize depoolExpand td
  UV * dpe          NB. Previous layer's td
)
NB. max pool
pool=: 4 : 0
  poolKernel=. 2 2 $ ,x
  pooled=: poolKernel kernelFunc ;.3 y
  pooled
)
kernelFunc=: ((>./)@:,)
NB. ---
NB. backward_33_=: 3 : 'shape $,y'   NB. From "3: flattenlayer.ijs"
NB. ----
backward_34_=: 3 : 0    NB. From "5: advnn.ijs, coclass 'SimpleLayer'"
  delta=. y
  wg=. (|: i,"1[1) dot delta
  delta=. delta dot |: }: w
  wg=. wg % bs
  w=: > (<w) calcGrad__solver <wg
  delta
)
NB. ----
NB. backward_36_=: 3 : 0   NB. From "1: activation.ijs"
NB. backward_38_=: 3 : 0   NB. From "5: advnn.ijs, coclass 'SimpleLayer'"
NB. backward_40_=: 3 : 0   NB. From "1: activation.ijs"
NB. backward_42_=: 3 : 0   NB. From "5: advnn.ijs, coclass 'SimpleLayer'"
NB. backward_44_=: 3 : 0   NB. From "1: activation.ijs"
NB. backward_45_=: 3 : 'dt=. dropnet * y'   NB. From "6: dropout.ijs"
NB. backward_46_=: 3 : 0   NB. From "5: advnn.ijs, coclass 'SimpleLayer'"
NB. backward_48_=: 3 : 0   NB. From "1: activation.ijs"
Summary Opinion
Though the code seems to work somewhat, it stalls at a fairly low prediction accuracy.
I tried to work through what needed to be done to improve it, but I found the object-oriented style very obfuscatory because it scatters things like parallel data structures and virtually identical verbs across numerous disparate namespaces.
Advanced Topics
Finding Locations of J Items
We look at some ways to find where a J item was defined.
NB.* whereis: a helpful verb from Mike Day for finding locations of items in J.
whereis=: 3 : 0
  u=.<'_'
  if. '_'={:y do.
    'y l'=.([:,<)"0<;._2 y
  elseif. #n=.I.'__'E.y do.
    l=.<y}.~n+2
    y=.<n{.y
  elseif. do.
    l=.cofind y=.<y
  end.
  l,.(4!:4"1([:<;)"1 y,.u,.l,.u){a:,~4!:3''
)

source=. 0 : 0
'Mike Day' via Programming <programming@jsoftware.com>
to: programming@jsoftware.com
date: Apr 9, 2019, 1:29 PM
subject: Re: [Jprogramming] (no subject)
)

egUse=: 0 : 0
   whereis 'plot'   NB. No result because I have not loaded "plot".
   load 'plot'
   whereis 'plot'
+------+----------------------------------------------------------------+
|jzplot|c:\users\devon_mccormick\j64-807\addons\graphics\plot\jzplot.ijs|
+------+----------------------------------------------------------------+
|z     |c:\users\devon_mccormick\j64-807\addons\graphics\plot\plot.ijs  |
+------+----------------------------------------------------------------+
)
There's also this, courtesy of Dan Bron:
   whereDefined
3 : '(4!:4{.;:y) {:: (4!:3''''),<''Source of definition not found for "'',''".'',~y'
   whereDefined 'whereDefined'
C:\amisc\JSys\J7\DHMUtils7.ijs
Saving J Nouns in Namespaces to File
For the neural net code above, with its numerous arrays, it is handy to be able to save them to preserve the state to which a neural net has evolved. Here's one way of doing this, accounting for the names being scattered across multiple namespaces.
NB.* saveAllVars: save all vars in namespace x to dir specified.
saveAllVars_WS_=: 4 : 0
  'base' saveAllVars y
:
  coclass x [ svcoc_WS_=. coname''
  nmlst_WS_=. <;._1 ' ',dsp_base_,(names 0),"1 ' '
  nmlst_WS_=. nmlst_WS_,&.><'_',x,'_'
  if. -.dirExists_base_ y do.            NB. Wait for dir to
    6!:3,1 [ rc=. 1!:5 <y
  end.                                   NB. be created.
  nmFlnms=. (<y) fileVar_WS_&.>nmlst_WS_ NB. List files saved to.
  coclass svcoc_WS_
NB.EG fileNames=. 'someNamespace' saveAllVars '\temp\foo'
)
Here's how we save a CNN model with nouns in various namespaces:
COVALS__pipe saveAllVars_WS_ &.> <svdir
Learning and Teaching J
Thoughts on Programming Language Design by an All-star Cast
At the first annual charity event conducted by Puget Sound Programming Python (PuPPy) last Tuesday, four legendary language creators came together to discuss the past and future of language design. This event was organized to raise funds for Computer Science for All (CSforALL), an organization which aims to make CS an integral part of the educational experience.
Among the panelists were the creators of some of the most popular programming languages: Guido van Rossum (Python), James Gosling (Java), Larry Wall (Perl), and Anders Hejlsberg (C# and TypeScript).
The discussion was moderated by Carol Willing, who is currently a Steering Council member and developer for Project Jupyter. She is also a member of the inaugural Python Steering Council, a Python Software Foundation Fellow and former Director.
Key Principles of Language Design
The first question thrown at the panelists was, “What are the principles of language design?”
Guido van Rossum believes that designing a programming language is very similar to the way J.K. Rowling writes her books, the Harry Potter series.
When asked how, he said that J.K. Rowling is a genius in the way that some details she mentioned in her first Harry Potter book ended up playing important plot points in books six and seven.
Explaining how this relates to language design, he adds, “In language design often that’s exactly how things go”. When designing a language we start by committing to certain details, like the keywords we want to use, the style of coding we want to follow, etc. But whatever we decide on, we are stuck with it, and in the future we need to find new ways to use those details, just like Rowling. “The craft of designing a language is, on the one hand, picking your initial set of choices so that there are a lot of possible continuations of the story. The other half of the art of language design is going back to your story and inventing creative ways of continuing it in a way that you had not thought of,” he adds.
When James Gosling was asked how Java came into existence and what design principles he abided by, he simply said, “it didn’t come out of like a personal passion project or something. It was actually from trying to build a prototype.” James Gosling and his team were working on a project that involved understanding the domain of embedded systems. For this, they spoke to a lot of developers who built software for embedded systems to learn how their process worked.
This project had about a dozen people on it and Gosling was responsible for making things much easier from a programming language point of view. “It started out as kind of doing better C and then it got out of control that the rest of the project really ended up just providing the context”, he adds. In the end, the only thing that survived from that project was “Java”. It was basically designed to solve the problems of people who are living outside of data centers, people who are getting shredded by problems with networking, security, and reliability.
Larry Wall calls himself a “linguist” rather than a computer scientist. He wanted to create a language that was more like a natural language. Explaining through an example, he said, “Instead of putting people in a university campus and deciding where they go we’re just gonna see where people want to walk and then put shortcuts in all those places.” A basic principle behind creating Perl was to provide APIs to everything. It was aimed at being both a good text-processing language, linguistically, and a good glue language.
Wall further shares that in the 90s the language was stabilizing, but it did have some issues. So, in the year 2000, the Perl team basically decided to break everything and came up with a whole new set of design principles. Based on these principles, Perl was redesigned into Perl 6. Some of these principles were: pick the right defaults, conserve your brackets because even Unicode does not have enough of them, and don't reinvent object orientation poorly.
He adds, “A great deal of the redesign was to say okay what is the right peg to hang everything on? Is it object-oriented? Is it something in the lexical scope or in the larger scope? What is the right peg to hang each piece of information on, and if we don’t have that peg how do we create it?”
Anders Hejlsberg shares that he follows a common principle in all the languages he has worked on: “there’s only one way to do a particular thing.” He believes that if a developer is provided with four different ways, he may end up choosing the wrong path and realize it later in development. According to Hejlsberg, this is why developers often end up creating something called “simplexity”, which means taking something complex and wrapping a single wrapper on top of it so that the complexity goes away.
Similar to the views of Guido van Rossum, he further adds that you have to live with any decision you make when designing a language. When designing a language you need to be very careful about reasoning over what “not” to introduce into the language. Often, people will come to you with suggestions for updates, but you cannot really change the basic nature of the language; you can, however, extend it through extensions. You essentially have two options: stay true to the nature of the language, or develop a new one.
The Type System of Programming Languages
Guido van Rossum, when asked about the typing approach in Python, shared how it was when Python was first introduced. Early on, int was not a class; it was actually a little conversion function. If you wanted to convert a string to an integer you could do that with a built-in function. Later on, Guido realized that this was a mistake. “We had a bunch of those functions and we realized that we had made a mistake, we have given users classes that were different from the built-in object types.”
That’s where the Python team decided to reinvent the whole approach to types in Python and did a bunch of cleanups. So, they changed the function int into a designator for the class int. Now, calling the class means constructing an instance of the class.
James Gosling shared that his focus has always been performance, and one factor for improving performance is the type system. It is really useful for things like building optimizing compilers and doing ahead-of-time correctness checking. Having a type system also helps in cases where you are targeting small-footprint devices. “To do that kind of compaction you need every kind of help that it gives you, every last drop of information and, the earlier you know it, the better job you do,” he adds.
Anders Hejlsberg looks at type systems as a tooling feature. Developers love their IDEs, they are accustomed to things like statement completion, refactoring, and code navigation. These features are enabled by the semantic knowledge of your code and this semantic knowledge is provided by a compiler with a type system. Hejlsberg believes that adding types can dramatically increase the productivity of developers, which is a counterintuitive thought.
“We think that dynamic languages were easier to approach because you’ve got rid of the types which was a bother all the time. It turns out that you can actually be more productive by adding types if you do it in a non-intrusive manner and if you work hard on doing good type inference and so forth,” he adds.
Talking about the type system in Perl, Wall started off by saying that Perl 5 and Perl 6 had very different type systems. In Perl 5, everything was treated as a string, even if it was a number or a floating point. The team wanted to keep this feature in Perl 6 as part of the redesign, but they realized that “it’s fine if the new user is confused about the interchangeability but it’s not so good if the computer is confused about things.” For Perl 6, Wall and his team envisioned making it both a better object-oriented and a better functional programming language. To achieve this goal, it is important to have a very sound type system and a sound meta-object model underneath. And you also need to take slogans like “everything is an object, everything is a closure” very seriously.
What Makes a Programming Language Maintainable
Guido van Rossum believes that to make a programming language maintainable it is important to strike the right balance between a flexible and a disciplined approach. While dynamic typing is great for small programs, large programs require a much more disciplined approach. And it is better if the language itself enables that discipline rather than giving you the full freedom of doing whatever you want. This is why Guido is planning to add a technology very similar to TypeScript to Python. He adds: “TypeScript is actually incredibly useful and so we’re adding a very similar idea to Python. We are adding it in a slightly different way because we have a different context.”
Along with a type system, refactoring engines can also prove to be very helpful. They make it easier to perform large-scale refactorings, such as renaming across millions of lines of code at once. Often, people do not rename methods because it is really hard to go over a piece of code and rename exactly the right variable. If you are provided with a refactoring engine, you just need to press a couple of buttons, type in the new name, and it will be refactored in maybe just 30 seconds.
The origin of the TypeScript project was these enormous JavaScript codebases. As these codebases became bigger and bigger, it became quite difficult to maintain them. These codebases basically became “write-only code,” shared Anders Hejlsberg. He adds that this is why we need a semantic understanding of the code, which makes refactoring much easier. “This semantic understanding requires a type system to be in place and once you start adding that you add documentation to the code,” added Hejlsberg. Wall also supports the same thought, that “good lexical scoping helps with refactoring”.
The Future of Programming Language Design
When asked about the future of programming language design, James Gosling shared that a very underexplored area in programming is writing code for GPUs. He highlights the fact that currently we do not have any programming language that works like a charm with GPUs, and much work needs to be done in that area.
Anders Hejlsberg rightly mentioned that programming languages do not move with the same speed as hardware or all the other technologies. In terms of evolution, programming languages are more like maths and the human brain. He said, “We’re still programming in languages that were invented 50 years ago, all of the principles of functional programming were thought of more than 50 years ago.”
But, he does believe that instead of segregating into separate categories like object-oriented or functional programming, now languages are becoming multi-paradigm. “Languages are becoming more multi-paradigm. I think it is wrong to talk about oh I only like object-oriented programming, or imperative programming, or functional programming language.”
Now, it is important to be aware of recent research, the new thinking, and the new paradigms. Then we need to incorporate them in our programming style, but tastefully.
Watch the full panel discussion conducted by PuPPy to learn more.