How To Apply The Hurst Exponent In Python In A Rolling Window
Solution 1:
Let me offer you a three-step way forward, plus a bonus:
Step 1: a bit more robust Hurst Exponent implementation with test data
Step 2: a simple way to produce a "sliding-window"-alike calculation
Step 3: a bit more complicated way - if a ROLLING WINDOW is a MUST ...
Bonus: What should I write under the code of my question to have it done?
Step 1: a bit more robust Hurst Exponent implementation with test data :
Here, I will post a function implementation, taken from the QuantFX module, as-is ( Py2.7 will not make trouble in most places, yet any xrange() ought to be replaced with range() in Py3.x ).
This code contains a few improvements and some sort of self-healing, in case tests show that there are problems with a data segment ( QuantFX uses a convention of a natural flow of time, where data[0] is the "oldest" time-series cell and data[-1] is the "most recent" one ).
Calling HurstEXP() without any parameter will yield a demo run, showing some tests and explanations of the subject matter. Also, print( HurstEXP.__doc__ ) is self-explanatory:
def HurstEXP( ts = [ None, ] ):  # TESTED: HurstEXP() Hurst exponent ( Brownian Motion & other observations measure ) 100+ BARs back(!)
    """ __doc__
USAGE:
HurstEXP( ts = [ None, ] )
Returns the Hurst Exponent of the time series vector ts[]
PARAMETERS:
ts[,] a time-series, with 100+ elements
( or [ None, ] that produces a demo run )
RETURNS:
float - a Hurst Exponent approximation,
as a real value
or
an explanatory string on an empty call
THROWS:
n/a
EXAMPLE:
>>> HurstEXP() # actual numbers will vary, as per np.random.randn() generator used
HurstEXP( Geometric Browian Motion ): 0.49447454
HurstEXP( Mean-Reverting Series ): -0.00016013
HurstEXP( Trending Series ): 0.95748937
'SYNTH series demo ( on HurstEXP( ts == [ None, ] ) ) # actual numbers vary, as per np.random.randn() generator'
>>> HurstEXP( rolling_window( aDSEG[:,idxC], 100 ) )
REF.s:
>>> www.quantstart.com/articles/Basics-of-Statistical-Mean-Reversion-Testing
    """
    #---------------------------------------------------------------------------------------------------------------------------<self-reflective>
    if ( ts[0] == None ):  # DEMO: Create a SYNTH Geometric Brownian Motion, Mean-Reverting and Trending Series:
        gbm = np.log( 1000 + np.cumsum(     np.random.randn( 100000 ) ) )  # a Geometric Brownian Motion [log(1000 + rand), log(1000 + rand + rand ), log(1000 + rand + rand + rand ),... log( 1000 + rand + ... )]
        mr  = np.log( 1000 +                np.random.randn( 100000 )   )  # a Mean-Reverting Series     [log(1000 + rand), log(1000 + rand ), log(1000 + rand ),... log( 1000 + rand )]
        tr  = np.log( 1000 + np.cumsum( 1 + np.random.randn( 100000 ) ) )  # a Trending Series           [log(1001 + rand), log(1002 + rand + rand ), log(1003 + rand + rand + rand ),... log(101000 + rand + ... )]
        # Output the Hurst Exponent for each of the above SYNTH series
        print ( "HurstEXP( Geometric Browian Motion ): {0: > 12.8f}".format( HurstEXP( gbm ) ) )
        print ( "HurstEXP( Mean-Reverting Series ): {0: > 12.8f}".format( HurstEXP( mr ) ) )
        print ( "HurstEXP( Trending Series ): {0: > 12.8f}".format( HurstEXP( tr ) ) )
        return ( "SYNTH series demo ( on HurstEXP( ts == [ None, ] ) ) # actual numbers vary, as per np.random.randn() generator" )
    """ # FIX:
===================================================================================================================
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ :1000,QuantFX.idxH].tolist() )
0.47537688039105963
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ :101,QuantFX.idxH].tolist() )
-0.31081076640420308
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ :100,QuantFX.idxH].tolist() )
nan
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ :99,QuantFX.idxH].tolist() )
Intel MKL ERROR: Parameter 6 was incorrect on entry to DGELSD.
C:\Python27.anaconda\lib\site-packages\numpy\lib\polynomial.py:594: RankWarning: Polyfit may be poorly conditioned
warnings.warn(msg, RankWarning)
0.026867491053098096
    """
    pass;     too_short_list = 101 - len( ts )  # MUST HAVE 101+ ELEMENTS
    if ( 0 <  too_short_list ):                 # IF NOT:
        ts =  too_short_list * ts[:1] + ts      # PRE-PEND SUFFICIENT NUMBER of [ts[0],]-as-list REPLICAS TO THE LIST-HEAD ( list inputs only )
    #---------------------------------------------------------------------------------------------------------------------------
    lags = range( 2, 100 )  # Create the range of lag values
    tau  = [ np.sqrt( np.std( np.subtract( ts[lag:], ts[:-lag] ) ) ) for lag in lags ]  # Calculate sqrt( std ) of the lagged differences, one value per lag
    #oly = np.polyfit( np.log( lags ), np.log( tau ), 1 )  # Use a linear fit to estimate the Hurst Exponent
    #eturn ( 2.0 * poly[0] )                               # Return the Hurst exponent from the polyfit output
    """ ********************************************************************************************************************************************************************* DONE:[MS]:ISSUE / FIXED ABOVE
|>>> QuantFX.HurstEXP( QuantFX.DATA[ : QuantFX.aMinPTR,QuantFX.idxH] )
C:\Python27.anaconda\lib\site-packages\numpy\core\_methods.py:82: RuntimeWarning: Degrees of freedom <= 0 for slice
warnings.warn("Degrees of freedom <= 0 for slice", RuntimeWarning)
C:\Python27.anaconda\lib\site-packages\numpy\core\_methods.py:94: RuntimeWarning: invalid value encountered in true_divide
arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
C:\Python27.anaconda\lib\site-packages\numpy\core\_methods.py:114: RuntimeWarning: invalid value encountered in true_divide
ret, rcount, out=ret, casting='unsafe', subok=False)
QuantFX.py:23034: RuntimeWarning: divide by zero encountered in log
return ( 2.0 * np.polyfit( np.log( lags ), np.log( tau ), 1 )[0] ) # Return the Hurst exponent from the polyfit output ( a linear fit to estimate the Hurst Exponent )
Intel MKL ERROR: Parameter 6 was incorrect on entry to DGELSD.
C:\Python27.anaconda\lib\site-packages\numpy\lib\polynomial.py:594: RankWarning: Polyfit may be poorly conditioned
warnings.warn(msg, RankWarning)
0.028471879418359915
|
|
|# DATA:
|
|>>> QuantFX.DATA[ : QuantFX.aMinPTR,QuantFX.idxH]
memmap([ 1763.31005859, 1765.01000977, 1765.44995117, 1764.80004883,
1765.83996582, 1768.91003418, 1771.04003906, 1769.43994141,
1771.4699707 , 1771.61999512, 1774.76000977, 1769.55004883,
1773.4699707 , 1773.32995605, 1770.08996582, 1770.20996094,
1768.34997559, 1768.02001953, 1767.59997559, 1767.23999023,
1768.41003418, 1769.06994629, 1769.56994629, 1770.7800293 ,
1770.56994629, 1769.7800293 , 1769.90002441, 1770.44995117,
1770.9699707 , 1771.04003906, 1771.16003418, 1769.81005859,
1768.76000977, 1769.39001465, 1773.23999023, 1771.91003418,
1766.92004395, 1765.56994629, 1762.65002441, 1760.18005371,
1755. , 1756.67004395, 1753.48999023, 1753.7199707 ,
1751.92004395, 1745.44995117, 1745.44995117, 1744.54003906,
1744.54003906, 1744.84997559, 1744.84997559, 1744.34997559,
1744.34997559, 1743.75 , 1743.75 , 1745.23999023,
1745.23999023, 1745.15002441, 1745.31005859, 1745.47998047,
1745.47998047, 1749.06994629, 1749.06994629, 1748.29003906,
1748.29003906, 1747.42004395, 1747.42004395, 1746.98999023,
1747.61999512, 1748.79003906, 1748.79003906, 1748.38000488,
1748.38000488, 1744.81005859, 1744.81005859, 1736.80004883,
1736.80004883, 1735.43005371, 1735.43005371, 1737.9699707
], dtype=float32
)
|
|
| # CONVERTED .tolist() to avoid .memmap-type artifacts:
|
|>>> QuantFX.DATA[ : QuantFX.aMinPTR,QuantFX.idxH].tolist()
[1763.31005859375, 1765.010009765625, 1765.449951171875, 1764.800048828125, 1765.8399658203125, 1768.9100341796875, 1771.0400390625, 1769.43994140625, 1771.469970703125, 1771.6199951171875, 1774.760
859375, 1743.75, 1743.75, 1745.239990234375, 1745.239990234375, 1745.1500244140625, 1745.31005859375, 1745.47998046875, 1745.47998046875, 1749.0699462890625, 1749.0699462890625, 1748.2900390625, 174
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ : QuantFX.aMinPTR,QuantFX.idxH].tolist() )
C:\Python27.anaconda\lib\site-packages\numpy\core\_methods.py:116: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
Intel MKL ERROR: Parameter 6 was incorrect on entry to DGELSD.
C:\Python27.anaconda\lib\site-packages\numpy\lib\polynomial.py:594: RankWarning: Polyfit may be poorly conditioned
warnings.warn(msg, RankWarning)
0.028471876494884543
===================================================================================================================
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ :1000,QuantFX.idxH].tolist() )
0.47537688039105963
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ :101,QuantFX.idxH].tolist() )
-0.31081076640420308
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ :100,QuantFX.idxH].tolist() )
nan
|
|>>> QuantFX.HurstEXP( QuantFX.DATA[ :99,QuantFX.idxH].tolist() )
Intel MKL ERROR: Parameter 6 was incorrect on entry to DGELSD.
C:\Python27.anaconda\lib\site-packages\numpy\lib\polynomial.py:594: RankWarning: Polyfit may be poorly conditioned
warnings.warn(msg, RankWarning)
0.026867491053098096
    """
    return ( 2.0 * np.polyfit( np.log( lags ), np.log( tau ), 1 )[0] )  # Return the Hurst exponent from the polyfit output ( a linear fit to estimate the Hurst Exponent )
Step 2: a simple way to produce a "sliding-window" calculation :
[ ( -i, HurstEXP( ts = df['Close'][:-i] ) ) for i in range( 1, 200 ) ] # calls HurstEXP() once for each of the last 199 end-points
TEST-ME:
>>> df[u'Close']
Date
1993-01-29     43.937500
1993-02-01     44.250000
...
2019-07-17    297.739990
2019-07-18    297.429993
Name: Close, Length: 6665, dtype: float64
>>> [ ( -i,
        HurstEXP( df[u'Close'][:-i] )
        ) for i in range( 1, 10 )
      ]
[ ( -1, 0.4489364467179827 ),
( -2, 0.4489306967683502 ),
( -3, 0.44892205577752986 ),
( -4, 0.448931424819551 ),
( -5, 0.44895272101162326 ),
( -6, 0.44896713741862954 ),
( -7, 0.44898211557287204 ),
( -8, 0.4489941656580211 ),
( -9, 0.4490116318052649 )
]
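Note that df['Close'][:-i] always starts at the first row, so each call sees nearly the whole history and only the end point moves, which is why the nine values above barely differ. The following is a minimal sketch of the difference between such end-anchored slices and fixed-length trailing windows. The helper name hurst_exponent, the WINDOW length and the synthetic closes array are assumptions made for illustration, not part of QuantFX or of the original data; the input is converted to a plain NumPy array first, so that the lagged subtraction is purely positional.

```python
import numpy as np

def hurst_exponent(values, max_lag=100):
    """Same lag-based formula as HurstEXP(), on a plain NumPy array (hypothetical helper)."""
    values = np.asarray(values, dtype=float)       # positional data only, no index labels
    lags = np.arange(2, max_lag)
    # sqrt of the standard deviation of the lagged differences, one value per lag
    tau = [np.sqrt(np.std(values[lag:] - values[:-lag])) for lag in lags]
    # slope of log(tau) against log(lag), doubled
    return 2.0 * np.polyfit(np.log(lags), np.log(tau), 1)[0]

rng = np.random.default_rng(0)
closes = 100.0 + np.cumsum(rng.standard_normal(3000))  # synthetic stand-in for the Close column
WINDOW = 500                                           # assumed window length
n = len(closes)

# Step 2 style: the start stays at element 0, only the end moves back by i
anchored = [(-i, len(closes[:-i]), hurst_exponent(closes[:-i]))
            for i in range(1, 4)]

# Fixed-length alternative: both ends move, each slice holds exactly WINDOW elements
fixed = [(-i, len(closes[n - i - WINDOW:n - i]), hurst_exponent(closes[n - i - WINDOW:n - i]))
         for i in range(1, 4)]

print(anchored)   # slice lengths 2999, 2998, 2997
print(fixed)      # slice lengths 500, 500, 500
```

Both variants end on the same bar for a given i; they differ only in how far back each slice reaches.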
Step 3: a bit more complicated way - if a ROLLING WINDOW is a MUST ... :
The stride-based "rolling window" trick may get injected into the game too, yet it brings no memory benefit and even less of a processing-efficiency benefit here. You spend a lot on syntactically plausible code, but HurstEXP() still has to be called once per window: its convolved nature cannot profit from the windowed view without an attempt to re-vectorise its internal code as well ( why, and what for? ). So use it just if a professor or boss still wants you to do so:
def rolling_window( aMatrix, aRollingWindowLENGTH ):
    """ __doc__
USAGE: rolling_window( aMatrix, aRollingWindowLENGTH )
PARAMS: aMatrix a numpy array
aRollingWindowLENGTH a LENGTH of a rolling window
RETURNS: a stride_trick'ed numpy array with rolling windows
THROWS: n/a
EXAMPLE: >>> x = np.arange( 10 ).reshape( ( 2, 5 ) )
>>> rolling_window( x, 3 )
array([[[0, 1, 2], [1, 2, 3], [2, 3, 4]],
[[5, 6, 7], [6, 7, 8], [7, 8, 9]]])
>>> np.mean( rolling_window( x, 3 ), -1 )
array([[ 1., 2., 3.],
[ 6., 7., 8.]])
"""
    new_shape   = aMatrix.shape[:-1] + ( aMatrix.shape[-1] - aRollingWindowLENGTH + 1, aRollingWindowLENGTH )
    new_strides = aMatrix.strides    + ( aMatrix.strides[-1], )
    return np.lib.stride_tricks.as_strided( aMatrix,
                                            shape   = new_shape,
                                            strides = new_strides
                                            )
>>> rolling_window( df[u'Close'], 100 ).shape
(6566, 100)
>>> rolling_window( df[u'Close'], 100 ).flags
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False <---------------- a VIEW, not a replica
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
You get an array of 6566 vectors with "rolling_window"-ed 100-day blocks of SPY[Close]-s:
>>> rolling_window( df[u'Close'], 100 )
array([[ 43.9375 , 44.25 , 44.34375 , ..., 44.5 , 44.59375 , 44.625 ],
[ 44.25 , 44.34375 , 44.8125 , ..., 44.59375 , 44.625 , 44.21875 ],
[ 44.34375 , 44.8125 , 45. , ..., 44.625 , 44.21875 , 44.8125 ],
...,
[279.14001465, 279.51998901, 279.32000732, ..., 300.6499939 , 300.75 , 299.77999878],
[279.51998901, 279.32000732, 279.20001221, ..., 300.75 , 299.77999878, 297.73999023],
[279.32000732, 279.20001221, 278.67999268, ..., 299.77999878, 297.73999023, 297.42999268]])
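On NumPy 1.20 or later, np.lib.stride_tricks.sliding_window_view builds the same kind of windowed view and returns it read-only by default, which guards against accidental writes through overlapping windows. Below is a minimal sketch, assuming the closes are available as a one-dimensional NumPy array (for a pandas Series, something like df['Close'].to_numpy()). The synthetic closes array, the helper hurst_exponent and the shorter max_lag are stand-ins chosen for illustration, not the author's data or code.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def hurst_exponent(values, max_lag=20):
    """Lag-based Hurst estimate on a NumPy array (hypothetical helper, shorter lag range)."""
    values = np.asarray(values, dtype=float)
    lags = np.arange(2, max_lag)
    tau = [np.sqrt(np.std(values[lag:] - values[:-lag])) for lag in lags]
    return 2.0 * np.polyfit(np.log(lags), np.log(tau), 1)[0]

rng = np.random.default_rng(1)
closes = 100.0 + np.cumsum(rng.standard_normal(500))   # synthetic stand-in for the Close column

# One row per window; no data is copied, the rows are views into closes
windows = sliding_window_view(closes, 200)

# The estimator is still called once per row, exactly as argued above
rolling_h = np.array([hurst_exponent(w) for w in windows])

print(windows.shape)             # (301, 200), i.e. 500 - 200 + 1 windows
print(windows.flags.writeable)   # False
print(rolling_h.shape)           # (301,)
```

The view itself is cheap; the cost sits in the per-window loop, so this changes the ergonomics rather than the run time.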
Q: What should I write under the code of my question to have it done?
for aRowINDEX in range( 1, 200 ):
    df[u'HurstEXP_COLUMN'][-aRowINDEX] = HurstEXP( df[u'Close'][:-aRowINDEX] )
    print( "[{0:>4d}]: DIFF( hurst() - HurstEXP() ) == {1:}".format( aRowINDEX,
                                                                     ( hurst(    df[u'Close'][:-aRowINDEX] )
                                                                     - HurstEXP( df[u'Close'][:-aRowINDEX] )
                                                                       )
                                                                     )
           )
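If a fixed-length window per row is what is wanted, pandas can drive the loop itself through Series.rolling( ... ).apply( ... ). The sketch below is only an illustration under stated assumptions: the DataFrame is synthetic, and the helper hurst_exponent, the WINDOW length and the shorter max_lag are hypothetical choices, not taken from the question or from QuantFX. Passing raw=True hands each window to the function as a NumPy array, so the lagged subtraction is positional.

```python
import numpy as np
import pandas as pd

def hurst_exponent(values, max_lag=20):
    """Lag-based Hurst estimate on a NumPy array (hypothetical helper, shorter lag range)."""
    values = np.asarray(values, dtype=float)
    lags = np.arange(2, max_lag)
    tau = [np.sqrt(np.std(values[lag:] - values[:-lag])) for lag in lags]
    return 2.0 * np.polyfit(np.log(lags), np.log(tau), 1)[0]

# Synthetic stand-in for the real price data
rng = np.random.default_rng(2)
dates = pd.date_range("2020-01-01", periods=400, freq="D")
df = pd.DataFrame({"Close": 100.0 + np.cumsum(rng.standard_normal(400))}, index=dates)

WINDOW = 150  # assumed window length

# Each row receives the estimate over the WINDOW closes ending at that row;
# the first WINDOW - 1 rows stay NaN because their window is incomplete
df["HurstEXP_COLUMN"] = df["Close"].rolling(WINDOW).apply(hurst_exponent, raw=True)

print(df["HurstEXP_COLUMN"].tail())
```

This assigns the whole column in one statement, so no per-row chained assignment is needed.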