NYCJUG/2018-02-13
Praise makes good men better and bad men worse. - T. Fuller, Gnomologia: Adagies and Proverbs, 1732
Beginner's regatta
We look at dyadic transpose, the more general cousin of monadic transpose |: .
Understanding Dyadic Transpose
We start with a question about how to transpose an array in ways other than simple monadic transpose |: which reverses the axes of its right argument, e.g.
   $ |: i. 2 3 4
4 3 2
   $ |: i. 9 5 1 4 1 3
3 1 4 1 5 9
What if, instead of the 4 3 2 result, we wanted it to be shaped 3 2 4?
from: 'Jon Hough' via Programming <programming@jsoftware.com>
to: Programming Forum <programming@jsoftware.com>
date: Wed, Jan 24, 2018 at 2:21 AM
subject: [Jprogramming] Reshaping 3+d matrices
I have a 3d matrix of shape a b c say. But I want to reshape it to have shape b a c, while still keeping data positional integrity.
So doing (b,a,c) $, myMatrix
will not work, as it destroys the relative positional relations between items. e.g.
mat =: ? 3 10 20 $ 0 NB. shape 3x10x20
I can reshape it using
|:"2|:"3|:"2 mat
But this seems ugly, and based on past experience there is usually an obvious / simple / elegant / in-built way to do it. It seems like there is a nicer way to do it.
Any ideas?
Thanks,
Jon
from: Rob Hodgkinson <rhodgkin@me.com> via forums.jsoftware.com
date: Wed, Jan 24, 2018 at 2:32 AM
Hi John, did you try dyadic transpose (x |: y) where x is a reposition vector of dimensions …
   (1 0 2 |: mat) -: |:"2|:"3|:"2 mat
1
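To see why the reshape in Jon's message scrambles the data while dyadic transpose does not, here is a small sketch on an array whose values reveal their original position. The names m, r and t are illustrative and do not come from the thread.

```j
m =: i. 2 3 4            NB. shape a b c = 2 3 4
r =: 3 2 4 $ , m         NB. reshape to b a c: same ravel order, items regrouped
t =: 1 0 2 |: m          NB. dyadic transpose: exchanges axes 0 and 1

($ r) -: $ t             NB. 1   both results have shape 3 2 4
r -: t                   NB. 0   but their contents differ
(<0 1 2) { m             NB. 6   the element at i j k = 0 1 2
(<1 0 2) { t             NB. 6   the same element, now found at j i k = 1 0 2
(<1 0 2) { r             NB. 10  reshape has put a different element there
```

The transpose keeps each element attached to its index triple, only reordered, which is the "positional integrity" asked for.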
from: David Lambert <b49p23tivg@gmail.com> via forums.jsoftware.com
date: Wed, Jan 24, 2018 at 8:54 AM
I'm so clueless about dyadic transpose I now go straight away to trying all 6 combinations of x for a rank 3 array, then choose the one which looks right.
"x|:y moves axesx to the tail end." Yes I'll study the NuVoc description now.
Documentation on Dyadic Transpose
Here we offer some notes on the J documentation for the dyadic version of this verb.
[From http://code.jsoftware.com/wiki/Vocabulary/barco#dyadic]
x |: y    Rearrange Axes
Rank 1 _ -- operates on lists of x and the entirety of y -- WHY IS THIS IMPORTANT?
________________________________________
Rearranges the axes of an array y.
• (x is an atom) — the axis having index x becomes the new last axis
• (x is a list) — the axes having indices x become the new last axes.
Still not entirely clear so let's look at some examples.
   ii=: ] {. [: i. 10 #~ #   NB. utility verb: make self-indexing array
   ]y =: ii 2 3 4
  0   1   2   3
 10  11  12  13
 20  21  22  23

100 101 102 103
110 111 112 113
120 121 122 123
This array y is called self-indexing because the value of each element is the base 10 representation of its index in the array. This helps us know we have selected the item we intended to because its index is shown by its value:
   y{~0 0 0;0 0 1;0 1 0;1 0 0;1 1 1
0 1 10 100 111
See how the values retrieved match their respective index if the value is represented as a 3-digit base 10 number.
Also, this method builds an array with 10 items along every axis before taking the part it needs, so it is expensive for larger examples, but it's fine for pedagogical purposes.
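If the cost ever mattered, one possible alternative is to decode each item's own index in base 10 rather than cutting a piece out of a larger array. This is only a sketch: the name ii2 is hypothetical, and like ii it assumes no axis is longer than 10.

```j
ii  =: ] {. [: i. 10 #~ #     NB. the verb from the text
ii2 =: 10 #. ] #: i.          NB. hypothetical alternative: index digits read in base 10

(ii -: ii2) 2 3 4             NB. 1   both verbs give the same array
(ii -: ii2) 2 3 4 5 6         NB. 1
```

Here `] #: i.` turns each item's position in the ravel into its index vector, and `10 #.` reads that vector as a base 10 number.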
Resuming our detailed explanation of dyadic transpose, we look at examples from the very simple to the increasingly complex.
   $y
2 3 4
   $ 0 |: y   NB. New shape shows original axis 0 (size=2) is now the last axis
3 4 2
   0 |: y     NB. Move axis 0 to end. Old axes 1 and 2 become new axes 0 and 1
 0 100
 1 101
 2 102
 3 103

10 110
11 111
12 112
13 113

20 120
21 121
22 122
23 123
This shows us that a scalar left argument indexes into the shape of the array to move that axis to the end, leaving the other axes in their same relative order. So, for the three possible values of this for a rank three array, we see:
   $&>0 1 2|:&.><y
3 4 2
2 4 3
2 3 4
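The same idiom extends to full-length left arguments, which is one way to carry out David's "try all six" approach for a rank 3 array in a single expression. This is a sketch with illustrative names m and perms.

```j
m     =: i. 2 3 4
perms =: (i. ! 3) A. i. 3              NB. all 6 permutations of 0 1 2, one per row
perms ,. $&> (<"1 perms) |:&.> < m     NB. each row: x followed by the shape of x |: m
```

Each row of the result pairs a left argument with the shape it produces, and that shape is always `x { $ m`.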
Now we see how a higher-rank array y behaves in the more general case of a vector x.
   $ y=: ii 2 3 4 5 6
2 3 4 5 6
   x =: 2 1   NB. axes with sizes 4 and 3 respectively become the last axes
   $ x |: y
2 5 6 4 3
________________________________________
More Information
1. Use a special form of x in x |: y to extract the diagonal of matrix y
   ] y =: 4 4 $ 'abcdefghijklmnop'
abcd
efgh
ijkl
mnop
   x =: < 0 1
   x |: y
afkp
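For the unboxed list case, a short left argument behaves like a full permutation in which the unnamed axes come first, in their original order. A small sketch of that equivalence follows; the names arr, ax and full are illustrative.

```j
arr  =: i. 2 3 4 5 6
ax   =: 2 1
full =: ax ,~ (i. # $ arr) -. ax     NB. the axes not named in ax, followed by ax

full                                 NB. 0 3 4 2 1
(ax |: arr) -: full |: arr           NB. 1   the short and full forms agree
($ ax |: arr) -: full { $ arr        NB. 1   new shape = old shape indexed by full
```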
A Mechanical Way to Determine the Left Argument to Dyadic Transpose
There's perhaps a simpler way to look at dyadic transpose if we consider only left arguments of the same length as the shape of the right argument.
   ary=. 2 3 4 5?@$100     NB. Random integer 4D array
   /: 5 4 3 2 i. $ary      NB. “5 4 3 2” = shape of monadic transpose
3 2 1 0
Here we want to transpose the 2 3 4 5 array into a 5 4 3 2 array, which is the same thing as monadic transpose: we simply reverse the axes.
   $3 2 1 0|:ary
5 4 3 2
However, viewing x as the grade of where y's axis lengths fall within the desired shape gives a simple recipe that makes it easy to achieve any rearrangement of axes.
   /: 4 2 5 3 i. $ary      NB. Random shape we want to achieve is 4 2 5 3
2 0 3 1
   $2 0 3 1|:ary           NB. Applying the vector which selects the axes in the order we want them
4 2 5 3
   /: 3 2 4 5 i. $ary      NB. Random shape we want to achieve is 3 2 4 5, so we grade this and use the grade vector
1 0 2 3
   $1 0 2 3|:ary           NB. as the left argument to dyadic transpose to get the desired shape.
3 2 4 5
For the cases where the desired shape vector is shorter than the number of axes, this method works as a complement to the documented description above because we specify the axes we want to move to the front instead of those we want to move to the end.
   /:4 5 i. $ary           NB. Move axes with values 4 and 5 to the front.
2 3 0 1
   $2 3 0 1 |: ary         NB. Desired result
4 5 2 3
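The grading recipe above can be wrapped in a small helper. This is a sketch only: the name toshape is hypothetical, and the approach assumes all axis lengths are distinct, since equal lengths would make the lookup ambiguous.

```j
toshape =: 4 : '(/: x i. $ y) |: y'   NB. hypothetical helper: x = desired shape, or its leading part
ary     =: i. 2 3 4 5

$ 4 2 5 3 toshape ary                 NB. 4 2 5 3
$ 4 5 toshape ary                     NB. 4 5 2 3
(/: 4 2 5 3 i. $ ary) -: ($ ary) i. 4 2 5 3    NB. 1
```

The last line shows that, for a full-length desired shape, looking the desired lengths up directly in `$ ary` gives the same left argument, because grading a permutation inverts it.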
Show-and-tell
Dan showed us how he parses XML files from sources with variable formats. This is reminiscent of earlier, simpler XML work we've done.
Working with XML
I typically get a group of XML files that need to be converted into a table form for further processing.
Some are via an API to a custom database, but some are the XML from a flat ods spreadsheet. I find that when Google spreadsheets are downloaded and converted to xls, the tara addon has problems reading them. If I convert to flat ods, I can use xslt.
Snippet of xslt file:
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
  <xsl:text>[worksheet] </xsl:text>
  <xsl:for-each select="office:document/office:body/office:spreadsheet/table:table">
    <xsl:value-of select="@table:name" /><xsl:text> </xsl:text>
    <xsl:text>[data] </xsl:text>
    <xsl:for-each select="table:table-row">
      <xsl:for-each select="table:table-cell">
        <xsl:value-of select="text:p" /><xsl:text> </xsl:text>
      </xsl:for-each>
      <xsl:text> </xsl:text>
    </xsl:for-each>
  </xsl:for-each>
Example of Usage
   load '/home/dan/Documents/J_Projects/XML_Processing/Convert_XML_v2.ijs'
   NB. read xml files (flat ods spreadsheets from office libre)
   files0
┌────────────────┬─────────────────┬─────┬───┬──────┬──────────┐
│janejones.fods  │2018 2 9 15 23 33│19683│rwx│------│-rwxr-xr-x│
├────────────────┼─────────────────┼─────┼───┼──────┼──────────┤
│robertsmith.fods│2018 2 9 15 20 47│18811│rwx│------│-rwxr-xr-x│
├────────────────┼─────────────────┼─────┼───┼──────┼──────────┤
│johndoe.fods    │2018 2 9 15 18 57│18850│rwx│------│-rwxr-xr-x│
└────────────────┴─────────────────┴─────┴───┴──────┴──────────┘
   NB. transform with xslt
   ,.shelltext=: ('xsltproc ', xslt1, ' ', ]) each filepath
┌───────────────────────────────────────────────────────────────────────────────────────────────────────...
│xsltproc /home/dan/Documents/J_Projects/XML_Processing/XSLT_Files/fods-v3.xslt /home/dan/Documents/J_Pr...
├───────────────────────────────────────────────────────────────────────────────────────────────────────...
│xsltproc /home/dan/Documents/J_Projects/XML_Processing/XSLT_Files/fods-v3.xslt /home/dan/Documents/J_Pr...
...
   ,.sheetdata1=: shell_jtask_ each shelltext
┌───────────────────────────────────────────────────────────────────────────────────────────────────────...
│[worksheet] Sheet1 [data] First Jane Last Jones Work Address 125 Wall St, New York, NY 11111 Home Ad...
├───────────────────────────────────────────────────────────────────────────────────────────────────────...
...
   NB. parse lines
   ,. each ,.sheetdata2=: <;._2 each (CR -.~ ]) each sheetdata1
┌────────────────────────────────────────────────────────┐
│┌─────────────────────────────────────────────┐         │
││[worksheet]                                  │         │
│├─────────────────────────────────────────────┤         │
││Sheet1                                       │         │
│├─────────────────────────────────────────────┤         │
││[data]                                       │         │
│├─────────────────────────────────────────────┤         │
││First Jane                                   │         │
│├─────────────────────────────────────────────┤         │
││Last Jones                                   │         │
│├─────────────────────────────────────────────┤         │
││Work Address 125 Wall St, New York, NY 11111 │         │
│├─────────────────────────────────────────────┤         │
││Home Address 456 99th St, Brooklyn, NY 12121 │         │
│├─────────────────────────────────────────────┤         │
││Home Phone 718-555-9999                      │         │
│├─────────────────────────────────────────────┤         │
││Work Phone 212-555-2222                      │         │
│├─────────────────────────────────────────────┤         │
││Mobile 929-555-3333                          │         │
│├─────────────────────────────────────────────┤         │
││Home E-Mail janejones@nowhere.com            │         │
│├─────────────────────────────────────────────┤         │
││Work E-Mail janejones@whizzo.com             │         │
│└─────────────────────────────────────────────┘         │
├────────────────────────────────────────────────────────┤
│┌──────────────────────────────────────────────────────┐│
││[worksheet]                                           ││
│├──────────────────────────────────────────────────────┤│
││Sheet1                                                ││
│├──────────────────────────────────────────────────────┤│
││[data]                                                ││
│├──────────────────────────────────────────────────────┤│
...
   NB. find sheets
   ]sheetdata2bin=: ((<'[worksheet]') = ]) each sheetdata2
┌───────────────────────┬─────────────────┬─────────────────┐
│1 0 0 0 0 0 0 0 0 0 0 0│1 0 0 0 0 0 0 0 0│1 0 0 0 0 0 0 0 0│
└───────────────────────┴─────────────────┴─────────────────┘
   NB. fret on sheets (these files only have 1 sheet each)
   ,.sheetdata3=: sheetdata2bin <;._1 each sheetdata2
┌───────────────────────────────────────────────────────────────────────────────────────────────────────...
│┌──────────────────────────────────────────────────────────────────────────────────────────────────────...
││┌──────┬──────┬───────────┬───────────┬─────────────────────────────────────────────┬─────────────────...
│││Sheet1│[data]│First Jane │Last Jones │Work Address 125 Wall St, New York, NY 11111 │Home Address 456 ...
...
   NB. get sheet names
   ]sheetnames1=: ; each each {. each each sheetdata3
┌────────┬────────┬────────┐
│┌──────┐│┌──────┐│┌──────┐│
││Sheet1│││Sheet1│││Sheet1││
│└──────┘│└──────┘│└──────┘│
└────────┴────────┴────────┘
   NB. fret on data tag and parse on tabs
   ]sheetdata3bin=: ((<'[data]') = ]) each each sheetdata3
┌───────────────────────┬─────────────────┬─────────────────┐
│┌─────────────────────┐│┌───────────────┐│┌───────────────┐│
││0 1 0 0 0 0 0 0 0 0 0│││0 1 0 0 0 0 0 0│││0 1 0 0 0 0 0 0││
│└─────────────────────┘│└───────────────┘│└───────────────┘│
└───────────────────────┴─────────────────┴─────────────────┘
   ,. each each ,.sheetdata4=: ; each sheetdata3bin <;._1 each each sheetdata3
┌──────────────────────────────────────────────────────────┐
│┌───────────────────────────────────────────────┐         │
││┌─────────────────────────────────────────────┐│         │
│││First Jane                                   ││         │
││├─────────────────────────────────────────────┤│         │
│││Last Jones                                   ││         │
││├─────────────────────────────────────────────┤│         │
│││Work Address 125 Wall St, New York, NY 11111 ││         │
││├─────────────────────────────────────────────┤│         │
│││Home Address 456 99th St, Brooklyn, NY 12121 ││         │
││├─────────────────────────────────────────────┤│         │
│││Home Phone 718-555-9999                      ││         │
││├─────────────────────────────────────────────┤│         │
│││Work Phone 212-555-2222                      ││         │
││├─────────────────────────────────────────────┤│         │
│││Mobile 929-555-3333                          ││         │
││├─────────────────────────────────────────────┤│         │
│││Home E-Mail janejones@nowhere.com            ││         │
││├─────────────────────────────────────────────┤│         │
│││Work E-Mail janejones@whizzo.com             ││         │
││└─────────────────────────────────────────────┘│         │
│└───────────────────────────────────────────────┘         │
├──────────────────────────────────────────────────────────┤
│┌────────────────────────────────────────────────────────┐│
││┌──────────────────────────────────────────────────────┐││
│││First Robert                                          │││
││├──────────────────────────────────────────────────────┤││
│││Last Smith                                            │││
...
   ,.sheetdata5=: > each each <;._2 each each each sheetdata4
┌─────────────────────────────────────────────────────────┐
│┌──────────────────────────────────────────────┐         │
││┌────────────┬───────────────────────────────┐│         │
│││First       │Jane                           ││         │
││├────────────┼───────────────────────────────┤│         │
│││Last        │Jones                          ││         │
││├────────────┼───────────────────────────────┤│         │
│││Work Address│125 Wall St, New York, NY 11111││         │
││├────────────┼───────────────────────────────┤│         │
...
   NB. this data has one sheet per file. ; is sufficient to make data mat
   ,.sheetdata6=: ; each sheetdata5
┌───────────────────────────────────────────────────────┐
│┌────────────┬───────────────────────────────┐         │
││First       │Jane                           │         │
│├────────────┼───────────────────────────────┤         │
││Last        │Jones                          │         │
│├────────────┼───────────────────────────────┤         │
││Work Address│125 Wall St, New York, NY 11111│         │
│├────────────┼───────────────────────────────┤         │
││Home Address│456 99th St, Brooklyn, NY 12121│         │
│├────────────┼───────────────────────────────┤         │
││Home Phone  │718-555-9999                   │         │
│├────────────┼───────────────────────────────┤         │
││Work Phone  │212-555-2222                   │         │
│├────────────┼───────────────────────────────┤         │
││Mobile      │929-555-3333                   │         │
│├────────────┼───────────────────────────────┤         │
││Home E-Mail │janejones@nowhere.com          │         │
│├────────────┼───────────────────────────────┤         │
││Work E-Mail │janejones@whizzo.com           │         │
│└────────────┴───────────────────────────────┘         │
├───────────────────────────────────────────────────────┤
│┌────────────┬────────────────────────────────────────┐│
││First       │Robert                                  ││
│├────────────┼────────────────────────────────────────┤│
││Last        │Smith                                   ││
│├────────────┼────────────────────────────────────────┤│
││Home Address│123 Broadway Blvd, Shelbyville, CA 99999││
│├────────────┼────────────────────────────────────────┤│
││Home Phone  │213-555-2323                            ││
│├────────────┼────────────────────────────────────────┤│
││Home E-Mail │robert.smith@nowhere.com                ││
│├────────────┼────────────────────────────────────────┤│
││Work E-Mail │robert.smith@acme.com                   ││
│└────────────┴────────────────────────────────────────┘│
├───────────────────────────────────────────────────────┤
│┌────────────┬──────────────────────────────────┐      │
││First       │John                              │      │
│├────────────┼──────────────────────────────────┤      │
││Last        │Doe                               │      │
│├────────────┼──────────────────────────────────┤      │
││Home Address│123 Main St, Springfield, CA 99999│      │
│├────────────┼──────────────────────────────────┤      │
││Home Phone  │213-555-1212                      │      │
│├────────────┼──────────────────────────────────┤      │
││Mobile      │213-555-4444                      │      │
│├────────────┼──────────────────────────────────┤      │
││Home E-Mail │john.doe@nowhere.com              │      │
│└────────────┴──────────────────────────────────┘      │
└───────────────────────────────────────────────────────┘
   NB. use pull mat to pick data
   sheetdata6 pullmat 'First';'Last'
┌────────────┬──────────────┬──────────┐
│┌────┬─────┐│┌──────┬─────┐│┌────┬───┐│
││Jane│Jones│││Robert│Smith│││John│Doe││
│└────┴─────┘│└──────┴─────┘│└────┴───┘│
└────────────┴──────────────┴──────────┘
   NB. process as needed
   ]namelist=: ([: < ([: > [) , ' ' , [: > ]) / each sheetdata6 pullmat 'First';'Last'
┌────────────┬──────────────┬──────────┐
│┌──────────┐│┌────────────┐│┌────────┐│
││Jane Jones│││Robert Smith│││John Doe││
│└──────────┘│└────────────┘│└────────┘│
└────────────┴──────────────┴──────────┘
Sample Code
We see that Dan accomplishes this prodigious task of parsing without too much code.
NB. load '/home/dan/Documents/J_Projects/XML_Processing/Convert_XML_v2.ijs'
NB. -----------------------------------------------------------------------
NB. process XML to a table
NB. -----------------------------------------------------------------------

NB. -------------------------------------------------------------------------
NB. Start Constants
NB. -------------------------------------------------------------------------
xmlpath=: '/home/dan/Documents/J_Projects/XML_Processing/Sample_Data/'
xslt1=: '/home/dan/Documents/J_Projects/XML_Processing/XSLT_Files/fods-v3.xslt'
progpath=: '/home/dan/Documents/J_Projects/XML_Processing/'
NB. ------------------------- end constants -----------------------------

NB. -------------------------------------------------------------------------
NB. Start Functions
NB. -------------------------------------------------------------------------
pullmat =: 4 : '((y i.~ ]) each {. " 1 each x) { each {. " 1 each }. " 1 each (a:,~ ]) each x'
pulltable=: ([: }. a: ,.~ [) {~"1 ] i.~ [: {. [
NB. ------------------------- end functions -----------------------------

files0=: fdir xmlpath,'*.fods'
files=: {. " 1 files0
filepath=: (xmlpath, ]) each files

NB. transform files with xslt
shelltext=: ('xsltproc ', xslt1, ' ', ]) each filepath
sheetdata1=: shell_jtask_ each shelltext
sheetdata2=: <;._2 each (CR -.~ ]) each sheetdata1
sheetdata2bin=: ((<'[worksheet]') = ]) each sheetdata2
sheetdata3=: sheetdata2bin <;._1 each sheetdata2
sheetnames1=: ; each each {. each each sheetdata3
sheetdata3bin=: ((<'[data]') = ]) each each sheetdata3
sheetdata4=: ; each sheetdata3bin <;._1 each each sheetdata3
sheetdata5=: > each each <;._2 each each each sheetdata4

NB. assuming data was 1 sheet per file
sheetdata6=: ; each sheetdata5

NB. create list of combined first and last names
namelist=: ([: < ([: > [) , ' ' , [: > ]) / each sheetdata6 pullmat 'First';'Last'

NB. get list of area codes
areacode=: (] {.~ '-' i.~ ]) each each sheetdata6 pullmat 'Home Phone';'Work Phone';'Mobile'

NB. convert to table
tableheader=: <;._2 'First|Last|Home E-Mail|Work E-Mail|Home Phone|Work Phone|Mobile|Home Address|Work Address|'
tabledata=: > sheetdata6 pullmat tableheader
table=: tableheader, tabledata
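The core of the parsing is the cut conjunction used twice: `<;._2` splits on a trailing delimiter and `x <;._1 y` splits at the positions flagged in a boolean x. The sketch below runs on a made-up string standing in for the xsltproc output of one file, so it does not need the files or the shell call. The verb lookup is a hypothetical single-table simplification in the spirit of pullmat, not Dan's code.

```j
NB. made-up stand-in for the text returned for one file
text =: '[worksheet]',LF,'Sheet1',LF,'[data]',LF,'First',TAB,'Jane',TAB,LF,'Last',TAB,'Jones',TAB,LF

lines =: <;._2 text                          NB. cut on the trailing LF: one box per line
data  =: ; (lines = <'[data]') <;._1 lines   NB. keep only the lines after the [data] tag
table =: > <;._2 each data                   NB. cut each line on its trailing TAB: key, value rows

NB. hypothetical lookup: the appended a: is what a missing key selects
lookup =: 4 : '(({."1 x) i. y) { ({:"1 x) , a:'

table lookup 'First';'Last';'Mobile'         NB. boxes holding Jane, Jones and an empty
```

Appending `a:` before indexing mirrors the `(a:,~ ])` step in pullmat: a key that is not found indexes one past the end, so it picks up the empty box instead of raising an index error.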
Advanced topics
We get the following [https://quantdare.com/correlation-prices-returns/ from this site].
Correlation with Prices or Returns: That is the Question
Thought you knew everything about correlation? Think there’s no fooling you with the question of correlation with financial prices or returns? Well maybe, just maybe, this post will enlighten you.
Correlation: the debate is on
Correlation can be a controversial topic. Things can go awry when two seemingly unrelated variables appear to move in a similar pattern and are found to be correlated. Take a look here at some unusual examples. My personal favourite is the clear relationship between the age of Miss America winners and the number of murders by hot things. There’s no denying it folks, just take a look for yourselves…
Although there must be similar cases with financial series (and I’d be interested to know of any) this post focuses on another tricky aspect of correlation in finance. We take a look at a typical mistake made by most finance newbies: calculating correlation with prices instead of returns. We’ve all been there.
[This point should be emphasized more: absolute price almost does not matter - returns are what an investor cares about.]
You’ve just begun your quant career and been made aware of your mistake; “you should use returns not prices for correlation”. And you accept it without a second thought and continue with your research, right? Well, now is your chance to take a closer look at that pesky correlation and prepare to be amazed.
But hold on a second, why are we even interested in correlation?
Correlation is the key to diversity
Who hasn’t heard the phrase “diversify your portfolio”? Diversification is pretty much number one priority in financial management (after making money, of course). The concept of not putting all your eggs in one basket is not new and it makes complete sense to control risk by spreading investments. Diversifying methods vary from selecting different asset classes (funds, bonds, stocks, etc.), combining industries, or varying the risk levels of investments. And the most common and direct diversification measurement used in these methods is correlation.
A simple Decision
From the point of view of an investor, what would you do given these possible asset investments?
Your first reaction is probably “invest in assets A and B, because C doesn’t look as good”. Then after a moment, you think “but A and B look highly correlated, so maybe A and C would be better”. But how would you feel if I told you that in fact A and B are perfectly negatively correlated and A and C perfectly positively correlated? A little confused, maybe? Not buying it? Let’s put the returns in a scatter plot:
That’s what I said: A and B have negative correlation and A and C positive correlation (and the points lie on exact straight lines). But you’re thinking: “the prices look positively correlated”. Yes, something strange is going on here.
Misconceptions
Don’t worry; you’re not the only one confused. Correlation, despite its apparent simplicity, is often misinterpreted even by experienced academics and investors. One misconception is that extreme values of correlation imply the movements of two series are in exact opposite directions (for -1) or the same direction (for +1). But this is not correct. Assets A and C are perfectly positively correlated. You would then often hear people say “A and C move up and down together”. But not so fast… for small positive returns of asset A (less than 1%) asset C has negative returns. Hmmm… Not as common is the belief that the magnitude of the movements is the same for series with ±1 correlation. This is also not correct.
Assets A and B are perfectly negatively correlated. Some may say “B moves the same amount as A but in the opposite direction”. Nope again. When A moves 4% B moves close to 0%.
Wait, so what did we miss? Let’s go back to basics.
What is Correlation?
Correlation is how closely variables are related. The Pearson correlation coefficient is its most common statistic and it measures the degree of linear relationship between two variables. Its values range between -1 (perfect negative correlation) and 1 (perfect positive correlation). Zero correlation implies no relationship between variables.
It is defined as the covariance between two variables, say X and Y, divided by the product of the standard deviations of each. Covariance is an unbounded statistic of how the variables change together, while standard deviation is a measure of data dispersion from its average.
This formula can be estimated for a sample by:
where x_t and y_t are the values of X and Y at time t. The sample means of X and Y are x̄ and ȳ respectively.
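For reference, the textbook forms that match this description are given below. The symbols ρ, r and T (the number of observations) are notation assumed here rather than taken from the post.

```latex
\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \, \sigma_Y},
\qquad
r = \frac{\sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y})}
         {\sqrt{\sum_{t=1}^{T} (x_t - \bar{x})^2}\;\sqrt{\sum_{t=1}^{T} (y_t - \bar{y})^2}}
```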
Uncovering the Mystery
Looking carefully at this last formula we see all the bracketed terms are differences to the variable average, so correlation is a comparison of the deviations from the means and not of the variations in the raw data itself. Hence, Pearson actually measures whether the variables are above or below their average at the same time. The term (x_t − x̄)(y_t − ȳ) is positive if both series are above (or below) their average together (and note the denominator is always positive). So a correct statement of perfect positive correlation would be “the upward deviations from the mean of asset A returns are simultaneous to upward deviations from the mean of asset B returns, and similarly with downward deviations”.
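Since these are J notes, here is a minimal J sketch of the Pearson coefficient written directly in terms of deviations from the mean. The return series a, b and c are made up to imitate the behaviour described for assets A, B and C; they are not the article's data.

```j
mean    =: +/ % #
dev     =: - mean                      NB. deviations from the mean
ssq     =: +/ @: *:                    NB. sum of squares
pearson =: (+/@:* % [: %: ssq@[ * ssq@])&dev

a =: 0.04 0.012 0.03 0.006 0.02        NB. made-up returns for A
b =: 0.04 - a                          NB. B: mirror image of A's deviations
c =: _0.005 + a % 2                    NB. C: half-size deviations, shifted down

a pearson b                            NB. _1 to within rounding
a pearson c                            NB. 1 to within rounding
* c                                    NB. 1 1 1 _1 1: C falls on a day when A rises
```

All of A's returns are positive, yet C has a negative return on the day A gains 0.6%, even though the two are perfectly positively correlated.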
This isn’t as intuitive as the typical “asset B goes up and down with asset A” and it is certainly not as easy to visualise. It’s no wonder correlation can be misleading.
Removing the Mean
Let’s go back to our example. The asset prices were created to follow geometric Brownian motions with a trend component and an irregular component. All three series have strong, positive, constant trend components, hence their upward random walks (A and B have the same magnitude and C has half). The irregular components are generated with the same series of random numbers but their signs have been inverted for B. These settings ensure the extreme correlations between the series. If we create two new series E and F with trend components set to zero then the upward bias is removed in the prices but the correlation on the returns stays the same. This is because the trend component doesn’t matter in the correlation calculation since it compares deviations from the mean returns, or in other words, from the trend.
The difference is that all upward returns in asset E do correspond to downward returns in asset F, and vice versa. This is like shifting the axes in the first scatter plot and centering them on the means of the series of A and B. This shifting concept can be applied to the correlation calculation by removing the means from the formula:
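Removing the sample means from the standard Pearson estimator, as just described, would give an expression of the following form. This is a reconstruction; the name r_0 and the sample size T are assumed notation.

```latex
r_0 = \frac{\sum_{t=1}^{T} x_t \, y_t}
           {\sqrt{\sum_{t=1}^{T} x_t^2}\;\sqrt{\sum_{t=1}^{T} y_t^2}}
```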
Instead of comparing deviations from the series’ averages we are directly comparing the values themselves. Using this QuantDare formula, we have the following correlations on the asset returns:
Well, it kind of makes more sense looking at the price series, but they’re very different to the Pearson coefficients. But hold on a second, wasn’t this post about correlation of prices and returns?
Prices vs returns
Yes, let’s get back to that. Thinking about Pearson’s formula, it’s more likely that deviations from average prices are above and below at the same time since financial series usually have an upward bias together. Due to this, price correlations tend to be positive.
Also, prices are not independent. Let P_t be the price of an asset at time t and then the time series can be written as:
P_0, P_1, P_2, …, P_T.
Let R_t be the return at time t: R_t = P_t − P_{t−1}. Then we can rewrite the price series as:
P_0, P_0 + R_1, P_0 + R_1 + R_2, …, P_0 + R_1 + … + R_T.
Imagine correlation calculated over these prices. The first return R_1 contributes to all the following entries and impacts every data point. On the other hand, the last return R_T only contributes to one. In this way, early changes in the prices have more weight than later changes in the correlation calculation whereas with the returns each one has equal importance. For this reason, correlation with prices is more sensitive to the number of time periods it’s calculated over.
Using our asset examples, the Pearson correlation coefficients over prices are more in line with the visual perception. The magnitudes are different, but the signs coincide with the QuantDare formula with returns. This QD formula, however, doesn’t work with prices. It always produces positive correlations since it requires stationary series.
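The relationship between prices and returns, and the unequal weight of early returns, can be checked with a few J primitives. The numbers below are made up for illustration, and returns are taken as simple price differences, as in the text.

```j
R  =: 3 _1 2 _4 5             NB. made-up returns R_1 .. R_T
P0 =: 100                     NB. made-up starting price
P  =: P0 + 0 , +/\ R          NB. P_0, P_0+R_1, P_0+R_1+R_2, ...

P                             NB. 100 103 102 104 100 105
2 -~/\ P                      NB. 3 _1 2 _4 5: differencing recovers the returns

M =: >:/~ i. # R              NB. lower-triangular ones: row t marks the returns inside P_t
(}. P) -: P0 + M +/ . * R     NB. 1
+/ M                          NB. 5 4 3 2 1: R_1 appears in five prices, R_T in one
```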
Which Correlation Calculation Convinces us More?
Well, it all depends on the relationship you’re interested in comparing. Short-term changes are better interpreted from returns correlations, whilst valuations of long-term evolutions may be improved using prices. And if what you really want is to analyse if two series move up and down together, then you should replace the Pearson coefficient with the QuantDare formula over the return series.
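A J sketch of such a mean-free measure follows. It assumes the QuantDare formula is the Pearson estimator with the means left out, which is how the text describes it, and it uses made-up returns rather than the article's data.

```j
ssq  =: +/ @: *:
cor0 =: +/@:* % [: %: ssq@[ * ssq@]    NB. assumed form: no means are subtracted

a =: 0.04 0.012 0.03 0.006 0.02        NB. made-up returns, mostly rising together
b =: 0.04 - a
c =: _0.005 + a % 2

a cor0 b                               NB. positive: both series mostly go up
a cor0 c                               NB. positive, but below 1
a cor0 a                               NB. 1
```

On these series the Pearson coefficient of a and b is -1, while the mean-free version is positive because both sets of returns are almost always at or above zero.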
The most important thing with correlation is to really understand what is being measured and give the correct interpretation. It is such a common statistic used by professionals and laymen alike in all kinds of fields; it is easy to build a false confidence around its meaning and make inaccurate statements or misleading conclusions.
But maybe, just maybe, this post will help to avoid future confusion and misinterpretation of this useful measure of relationship.
Learning and Teaching J
We heard from an old friend:
Doing What is Expected
from: Jose Mario Quintana <jose.mario.quintana@gmail.com> via srs.acm.org
to: chat@jsoftware.com
date: Thu, Dec 14, 2017 at 12:57 PM
subject: [Jchat] Reference manuals (was [Jprogramming] i. (2 2 $ 1 2 3 4))
I want to clarify that I was not recommending to ignore the Dictionary. I was just trying to state that I learned J initially by reading an early 90's version of the Dictionary (and I mean the actual Dictionary, not the Introduction that is at the beginning of the document which I only read partially).
Someone else made once the argument that trying to learn the J language by reading (just) the J Dictionary is akin to trying to learn the English language (just) by reading an English dictionary.
I consult the Dictionary infrequently but I can get away with it because almost always J does what I expected to do (even in wicked contexts that were not meant to be). This seems to be no accident:
I asked Ken, I think it may have been at the HOPL Conference, “What is the touchstone to making an elegant programming language?” He said, “The secret is, it has to do what you expect it to do.”
If you stop and think about APL and if you stop and think about J and if you think about Ken’s work generally, it is that high degree of consistency which is the product of an exceptionally clean mind, and a fierce determination not to invent any new constructs, until you have to.
— Fred Brooks, A Celebration of Kenneth Iverson, 2004-11-30