library(emhawkes)
This subsection explaines how to construct, simulate, and estimate a univariate Hawkes model. First, create a hspec
which defines the Hawkes model. S4 class hspec
contains slots of model parameters, mu
, alpha
, beta
, dimens
, rmark
, and impact
.
The parameters of the model, mu
, alpha
, beta
, are defined by matrices. The following is an example of a univariate model (without mark).
set.seed(1107)
<- 0.3; alpha1 <- 1.2; beta1 <- 1.5
mu1 <- new("hspec", mu=mu1, alpha=alpha1, beta=beta1)
hspec1 show(hspec1)
#> An object of class "hspec"
#> Slot "mu":
#> [,1]
#> [1,] 0.3
#>
#> Slot "alpha":
#> [,1]
#> [1,] 1.2
#>
#> Slot "beta":
#> [,1]
#> [1,] 1.5
#>
#> Slot "dimens":
#> [1] 1
#>
#> Slot "rmark":
#> function (...)
#> 1
#> <environment: 0x0000000014fdcac8>
#>
#> Slot "dmark":
#> NULL
#>
#> Slot "impact":
#> NULL
#>
#> Slot "type_col_map":
#> NULL
The function hsim
implements simulation where the input arguments are hspec
, size
and the initial values of intensity lambda
and Hawkes processes N
. If the initial values of lambda
are omitted, the internally determined initial values are used. The default initial value of N
is zero.
<- hsim(hspec1, size=100)
res1 summary(res1)
#> ------------------------------------`------
#> Simulation result of marked Hawkes model.
#> Realized path (with right continuous representation):
#> arrival N1 lambda1
#> [1,] 0.00000 0 0.90000
#> [2,] 0.97794 1 0.43838
#> [3,] 1.09001 2 1.43128
#> [4,] 1.28999 3 2.02711
#> [5,] 1.53225 4 2.33527
#> [6,] 1.65001 5 3.01139
#> [7,] 2.51807 6 1.36377
#> [8,] 2.81710 7 1.74553
#> [9,] 2.87547 8 2.72378
#> [10,] 3.16415 9 2.65016
#> [11,] 3.51378 10 2.40131
#> [12,] 4.22355 11 1.43843
#> [13,] 16.96752 12 0.30000
#> [14,] 17.71654 13 0.69015
#> [15,] 19.10293 14 0.49874
#> [16,] 24.06354 15 0.30082
#> [17,] 24.09256 16 1.44967
#> [18,] 28.40173 17 0.30366
#> [19,] 28.53743 18 1.28198
#> [20,] 28.56702 19 2.38725
#> ... with 80 more rows
#> ------------------------------------------
The results of hsim
is an S3 class hreal
which consists of hspec
, inter_arrival
, arrival
, type
, mark
, N
, Nc
, lambda
, lambda_component
, rambda
, rambda\_component
. The components inter_arrival
, type
, mark
, N
, Nc
, lambda
, lambda_component
, rambda
, rambda_component
can be excessed during simulation and estimation. The column names of N
are N1
, N2
, N3
and so on. Similarly, the column names of lambda
are lambda1
, lambda2
, lambda3
and so on. The column names of lambda_component
are lambda_component11
, lambda_component12
, lambda_component13
and so on. inter_arrival
, type
, mark
, N
, Nc
start at zero. Using summary
function, one can print the first 20 elements of arrival
, N
and lambda
.
hspec
is the model specification, inter_arrival
is the inter-arrival time of every event, and arrival
is the cumulative sum of inter_arrival
. type
is the type of events, i.e., \(i\) for \(N_i\), and mark
is a numeric vector which represents additional information for the event. lambda
represents \(\bm{\lambda}\) which is the left continuous and right limit version. The right continuous version of intensity is rambda
. lambda_component
represents \(\lambda_{ij}\) and rambda_component
is the right continuous version.
The log-likelihood function is computed by logLik
method. In this case, the inter-arrival times and hspec
are inputs of the function.
logLik(hspec1, inter_arrival = res1$inter_arrival)
#> [,1]
#> [1,] -28.09111
The likelihood estimation is performed using mhfit
function. The specification of the initial values of the parameters, mhspec0
is needed. In the following example, hspec0
is set to be hspec1
, which is defined previously, for simplicity, but any candidates for the starting value of the numerical procedure can be used.
Note that only inter_arrival
is needed. (Indeed, for more precise simulation, lambda0
, the inital value of lambda compoment, should be specified. If not, internally determined initial values are set.)
<- 0.5; alpha0 <- 1.0; beta0 <- 1.8
mu0 <- new("hspec", mu=mu0, alpha=alpha0, beta=beta0)
hspec0 <- hfit(hspec0, inter_arrival = res1$inter_arrival)
mle summary(mle)
#> --------------------------------------------
#> Maximum Likelihood estimation
#> BFGS maximization, 44 iterations
#> Return code 0: successful convergence
#> Log-Likelihood: -26.68325
#> 3 free parameters
#> Estimates:
#> Estimate Std. error t value Pr(> t)
#> mu1 0.29680 0.09286 3.196 0.00139 **
#> alpha1 1.75745 0.40859 4.301 1.70e-05 ***
#> beta1 2.14300 0.49441 4.334 1.46e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> --------------------------------------------
In a bivariate model, the parameters, the slots of hspec
, are matrices. mu
is 2-by-1, and alpha
and beta
are 2-by-2 matrices. rmark
is a random number generating function. lambda0
, 2-by-2 matrix, represents the initial values of lambda_component
, a set of lambda11, lambda12, lambda21, lambda22
. The intensity processes are represented by
\[ \lambda_1(t) = \mu_1 + \lambda_{11}(t) + \lambda_{12}(t) \]
\[ \lambda_2(t) = \mu_2 + \lambda_{21}(t) + \lambda_{22}(t) \]
\(\lambda_{ij}\) called lambda components and lambda0
is the time zero values of \(lambda_{ij}\), i.e., \(\lambda_{ij}(0)\). lambda0
can be omitted and then internally determined initial values are used.
<- matrix(c(0.2), nrow = 2)
mu2 <- matrix(c(0.5, 0.9, 0.9, 0.5), nrow = 2, byrow = TRUE)
alpha2 <- matrix(c(2.25, 2.25, 2.25, 2.25), nrow = 2, byrow = TRUE)
beta2 <- matrix(c(0.1, 0.1, 0.1, 0.1), nrow = 2, byrow = TRUE)
lambda0 <- new("hspec", mu=mu2, alpha=alpha2, beta=beta2)
hspec2 print(hspec2)
#> An object of class "hspec"
#> Slot "mu":
#> [,1]
#> [1,] 0.2
#> [2,] 0.2
#>
#> Slot "alpha":
#> [,1] [,2]
#> [1,] 0.5 0.9
#> [2,] 0.9 0.5
#>
#> Slot "beta":
#> [,1] [,2]
#> [1,] 2.25 2.25
#> [2,] 2.25 2.25
#>
#> Slot "dimens":
#> [1] 2
#>
#> Slot "rmark":
#> function (...)
#> 1
#> <bytecode: 0x000000001d5681f8>
#> <environment: 0x0000000016347bb0>
#>
#> Slot "dmark":
#> NULL
#>
#> Slot "impact":
#> NULL
#>
#> Slot "type_col_map":
#> NULL
To simulate, use function mhsim
.
<- hsim(hspec2, size=1000)
res2 #> Warning in hsim(hspec2, size = 1000): The initial values for intensity processes are not provided. Internally determined initial values are used.
summary(res2)
#> ------------------------------------`------
#> Simulation result of marked Hawkes model.
#> Realized path (with right continuous representation):
#> arrival N1 N2 lambda1 lambda2
#> [1,] 0.0000 0 0 0.52941 0.52941
#> [2,] 2.4745 1 0 0.20126 0.20126
#> [3,] 2.5048 1 1 0.66824 1.04190
#> [4,] 2.6013 1 2 1.30125 1.28004
#> [5,] 2.6762 1 3 1.89085 1.53497
#> [6,] 2.6925 1 4 2.69755 1.96890
#> [7,] 3.0421 2 4 1.74738 1.23334
#> [8,] 3.2604 3 4 1.45274 1.38296
#> [9,] 3.3405 4 4 1.66374 1.93951
#> [10,] 3.4042 4 5 1.90135 2.48684
#> [11,] 3.4359 4 6 2.62234 2.79506
#> [12,] 3.6821 5 6 2.10926 1.97865
#> [13,] 3.9600 6 6 1.48929 1.63345
#> [14,] 4.1582 6 7 1.34558 1.69398
#> [15,] 4.3863 7 7 1.42426 1.39337
#> [16,] 4.5365 8 7 1.42972 1.69297
#> [17,] 4.9093 8 8 0.94771 1.23441
#> [18,] 5.0411 9 8 1.42489 1.34067
#> [19,] 5.0862 9 9 1.75860 2.04393
#> [20,] 5.0943 9 10 2.61395 2.50137
#> ... with 980 more rows
#> ------------------------------------------
Frome the result, we get a vector of realized inter_arrival
and type
. Bivariate model requires inter_arrival
and type
for estimation.
<- res2$inter_arrival
inter_arrival2 <- res2$type type2
Log-likelihood is computed by a function logLik
.
logLik(hspec2, inter_arrival = inter_arrival2, type = type2)
#> Warning in .local(object, ...): The initial values for intensity processes are not provided. Internally determined initial values are used.
#> [1] -1044.734
A log-likelihood estimation is performed using hfit
. In the following, the values of parameter slots in mhspec0
, such as mu, alpha, beta
, are regarded as a starting point of the numerical optimization. For simplicity, we use hspec0 <- hspec2
. Since the true parameter values are not known in the actual problem, the initial value should be guessed. The realized inter_arrival
and type
are used.
<- hspec2
hspec0 <- hfit(hspec0, inter_arrival = inter_arrival2, type = type2)
mle summary(mle)
#> --------------------------------------------
#> Maximum Likelihood estimation
#> BFGS maximization, 36 iterations
#> Return code 0: successful convergence
#> Log-Likelihood: -1041.741
#> 4 free parameters
#> Estimates:
#> Estimate Std. error t value Pr(> t)
#> mu1 0.18119 0.01502 12.064 < 2e-16 ***
#> alpha1.1 0.48349 0.07535 6.417 1.39e-10 ***
#> alpha1.2 0.97070 0.09770 9.935 < 2e-16 ***
#> beta1.1 2.09494 0.18481 11.336 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> --------------------------------------------
This subsection explains about the relation between parameter setting and estimation procedure in two-dimensional Hawkes model. The number of parameters to be estimated in the model depends on how we set the parameter slots such as alpha
and beta
in mhspec0
, the sepcification for initial values.. Since the paremeter slot such as alpha
is a matrix, and the element in the matrix can be the same or different. The number of parameters in the estimation varies depending on whether or not some of the elements in the initial setting are the same or different.
For example, if alpha[1,1]
and alpha[1,2]
in mhspec0
are different, the numerical procedure tries to estimate both parameters of alpha[1,1]
and alpha[1,2]
. If alpha[1,1]
and alpha[1,2]
are the same in the initial setting, then the estimation procedure considered two parameters are the same in the model and hence only one of them is estimated.
Recall that th example in the previous section is of a symmetric Hawkes model where alpha
is symmetric.
print(hspec2)
#> An object of class "hspec"
#> Slot "mu":
#> [,1]
#> [1,] 0.2
#> [2,] 0.2
#>
#> Slot "alpha":
#> [,1] [,2]
#> [1,] 0.5 0.9
#> [2,] 0.9 0.5
#>
#> Slot "beta":
#> [,1] [,2]
#> [1,] 2.25 2.25
#> [2,] 2.25 2.25
#>
#> Slot "dimens":
#> [1] 2
#>
#> Slot "rmark":
#> function (...)
#> 1
#> <bytecode: 0x000000001d5681f8>
#> <environment: 0x0000000016347bb0>
#>
#> Slot "dmark":
#> NULL
#>
#> Slot "impact":
#> NULL
#>
#> Slot "type_col_map":
#> NULL
<- hsim(hspec2, size=1000) res2
In the first example of estimation, alpha0
is a matrix where the all elements have the same value, 0.75. In this setting, mhfit
considers that alpha11 == alpha12 == alpha21 == alpha22
in the model (even though the actual parameters have different values). Similarly for other parmater matrix mu0
and beta0
.
<- matrix(c(0.15, 0.15), nrow = 2)
mu0 <- matrix(c(0.75, 0.75, 0.75, 0.75), nrow = 2, byrow=TRUE)
alpha0 <- matrix(c(2.6, 2.6, 2.6, 2.6), nrow = 2, byrow=TRUE)
beta0
<- new("hspec", mu=mu0, alpha=alpha0, beta=beta0)
hspec0 summary(hfit(hspec0, inter_arrival = res2$inter_arrival, type = res2$type))
#> --------------------------------------------
#> Maximum Likelihood estimation
#> BFGS maximization, 35 iterations
#> Return code 0: successful convergence
#> Log-Likelihood: -1282.818
#> 3 free parameters
#> Estimates:
#> Estimate Std. error t value Pr(> t)
#> mu1 0.19667 0.01535 12.82 <2e-16 ***
#> alpha1.1 0.62901 0.05876 10.70 <2e-16 ***
#> beta1.1 1.98320 0.19228 10.31 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> --------------------------------------------
In the second example, alpha0
’s elements are not same, but symmetric as in the original simulation. We have alpha11 == alpha22
and alpha11 == alpha22
in alpha0
and hence alpha11
and alpha12
will be estimated.
<- matrix(c(0.15, 0.15), nrow = 2)
mu0 <- matrix(c(0.75, 0.751, 0.751, 0.75), nrow = 2, byrow=TRUE)
alpha0 <- matrix(c(2.6, 2.6, 2.6, 2.6), nrow = 2, byrow=TRUE)
beta0
<- new("hspec", mu=mu0, alpha=alpha0, beta=beta0)
hspec0 summary(hfit(hspec0, inter_arrival = res2$inter_arrival, type = res2$type))
#> --------------------------------------------
#> Maximum Likelihood estimation
#> BFGS maximization, 31 iterations
#> Return code 0: successful convergence
#> Log-Likelihood: -1279.78
#> 4 free parameters
#> Estimates:
#> Estimate Std. error t value Pr(> t)
#> mu1 0.19704 0.01525 12.92 < 2e-16 ***
#> alpha1.1 0.51569 0.06654 7.75 9.16e-15 ***
#> alpha1.2 0.75801 0.08366 9.06 < 2e-16 ***
#> beta1.1 2.00977 0.18420 10.91 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> --------------------------------------------
By simply setting reduced = FALSE
, all parameters are estimated (not recommended).
summary(hfit(hspec2, inter_arrival = res2$inter_arrival, type = res2$type, reduced=FALSE))
#> --------------------------------------------
#> Maximum Likelihood estimation
#> BFGS maximization, 51 iterations
#> Return code 0: successful convergence
#> Log-Likelihood: -1277.887
#> 10 free parameters
#> Estimates:
#> Estimate Std. error t value Pr(> t)
#> mu1 0.21174 0.01369 15.468 < 2e-16 ***
#> mu2 0.18064 0.02017 8.954 < 2e-16 ***
#> alpha1.1 0.46858 NA NA NA
#> alpha2.1 0.86952 0.13948 6.234 4.54e-10 ***
#> alpha1.2 0.74933 0.16026 4.676 2.93e-06 ***
#> alpha2.2 0.48571 NA NA NA
#> beta1.1 1.72933 NA NA NA
#> beta2.1 2.11391 0.33986 6.220 4.98e-10 ***
#> beta1.2 2.37364 0.68219 3.479 0.000502 ***
#> beta2.2 1.76635 NA NA NA
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> --------------------------------------------
In the following, impact
represents the impact of mark. It is a function, and the first argument is param
represents the parameter of the model. impact
function can have additional arguments such as alpha
, n
, mark
, etc., which represents the dependence. Do not miss ...
as the ellipsis is omitted, an error occurs.
<- matrix(c(0.15, 0.15), nrow=2)
mu <- matrix(c(0.75, 0.6, 0.6, 0.75), nrow=2, byrow=T)
alpha <- matrix(c(2.6, 2.6, 2.6, 2.6), nrow=2)
beta <- function(param = c(p=0.65), ...){
rmark rgeom(1, p=param[1]) + 1
}
<- function(param = c(eta1=0.2), alpha, n, mark, ...){
impact <- matrix(rep(mark[n]-1, 4), nrow = 2)
ma * ma * matrix( rep(param["eta1"], 4), nrow=2)
alpha #alpha * ma * matrix( c(rep(param["eta1"], 2), rep(param["eta2"], 2)), nrow=2)
}<- new("hspec", mu=mu, alpha=alpha, beta=beta,
h1 rmark = rmark,
impact=impact)
<- hsim(h1, size=1000, lambda0 = matrix(rep(0.1,4), nrow=2))
res
<- hfit(h1,
fit inter_arrival = res$inter_arrival,
type = res$type,
mark = res$mark,
lambda0 = matrix(rep(0.1,4), nrow=2))
summary(fit)
#> --------------------------------------------
#> Maximum Likelihood estimation
#> BFGS maximization, 38 iterations
#> Return code 0: successful convergence
#> Log-Likelihood: -1773.869
#> 5 free parameters
#> Estimates:
#> Estimate Std. error t value Pr(> t)
#> mu1 0.145327 0.008906 16.317 < 2e-16 ***
#> alpha1.1 0.677646 0.090884 7.456 8.91e-14 ***
#> alpha1.2 0.539642 0.076877 7.020 2.23e-12 ***
#> beta1.1 2.432946 0.280576 8.671 < 2e-16 ***
#> eta1 0.131363 0.091662 1.433 0.152
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> --------------------------------------------
In the basic model, alpha
is a matrix, but it can be a function as in the following code. The function alpha
simply return a \(4\times4\) matrix but by doing so, we can fix some of elements as specific vales when estimating. When estimating, the optimization only works for the specified parameters in param
. In the case of simulation, there is no difference whether the parameter set is represented by a matrix or a function.
<- matrix(c(0.02, 0.02, 0.04, 0.04), nrow = 4)
mu
<- function(param = c(alpha11 = 0, alpha12 = 0.3, alpha33 = 0.3, alpha34 = 0.4)){
alpha matrix(c(param["alpha11"], param["alpha12"], 0, 0,
"alpha12"], param["alpha11"], 0, 0,
param[0, 0, param["alpha33"], param["alpha34"],
0, 0, param["alpha34"], param["alpha33"]), nrow = 4, byrow = T)
}
<- matrix(c(rep(0.7, 8), rep(1.1, 8)), nrow = 4, byrow = T)
beta
<- function(param = c(alpha1n=0, alpha1w=0.1, alpha2n=0.1, alpha2w=0.2),
impact n=n, N=N, ...){
<- matrix(c(0, 0, param['alpha1w'], param['alpha1n'],
Psi 0, 0, param['alpha1n'], param['alpha1w'],
'alpha2w'], param['alpha2n'], 0, 0,
param['alpha2n'], param['alpha2w'], 0, 0), nrow=4, byrow=T)
param[
<- N[,"N1"][n] - N[,"N2"][n] > N[,"N3"][n] - N[,"N4"][n] + 0.5
ind
<- matrix(c(!ind, !ind, !ind, !ind,
km
ind, ind, ind, ind,
ind, ind, ind, ind,!ind, !ind, !ind, !ind), nrow = 4, byrow = T)
* Psi
km
}
<- new("hspec",
h mu = mu, alpha = alpha, beta = beta, impact = impact)
<- hsim(h, size=1000)
hr
<- hfit(h, hr$inter_arrival, hr$type)
fit summary(fit)
#> --------------------------------------------
#> Maximum Likelihood estimation
#> BFGS maximization, 82 iterations
#> Return code 0: successful convergence
#> Log-Likelihood: -2456.944
#> 12 free parameters
#> Estimates:
#> Estimate Std. error t value Pr(> t)
#> mu1 0.021343 0.002491 8.568 < 2e-16 ***
#> mu3 0.040149 0.003482 11.529 < 2e-16 ***
#> alpha11 -0.032145 0.021615 -1.487 0.1370
#> alpha12 0.365874 0.055751 6.563 5.28e-11 ***
#> alpha33 0.278210 0.040201 6.920 4.50e-12 ***
#> alpha34 0.456877 0.051018 8.955 < 2e-16 ***
#> beta1.1 0.934040 0.121411 7.693 1.44e-14 ***
#> beta3.1 1.097560 0.094567 11.606 < 2e-16 ***
#> alpha1n -0.038311 0.017128 -2.237 0.0253 *
#> alpha1w 0.163844 0.031140 5.261 1.43e-07 ***
#> alpha2n 0.164285 0.069767 2.355 0.0185 *
#> alpha2w 0.121636 0.060617 2.007 0.0448 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> --------------------------------------------