PointProcessTools.jl
Exported data types
PointProcessTools.Record — Type
Record
events::Vector{<:AbstractFloat} -> Event history of the process
interval::Interval -> Time window of the observation
Represents an observed process. Event times are measured as time before present, so the present is represented as 0 and positive values represent past events. Negative values passed to the constructor are converted to positive.
Implemented functions: show, events_to_array, n_events, length, interval, start, finish, getindex.
The events can either be stored in a csv file, whose path is passed to the constructor, or passed directly as a Vector. For a csv file, the first column must contain the event times and the file must contain a header. See the folder Resources/data for an example.
Record("Resources/data/record.csv")
Record([1, 4, 5, 5.5, 8, 8.9])
An interval can be provided as the last positional arguments.
Record("Resources/data/record.csv", "Resources/data/proxy.csv", 100000, period=10000)
Record([1, 4, 5, 5.5, 8, 8.9], [0, 2, 4, 6, 8, 10], [1, 2, 5, 4, 7, 3], 200000, 500000)
For csv files, if the file contains additional categorical columns, keyword arguments can be passed to filter the desired events. Provide the pair col=[val1, val2, ..., valn] to keep only events whose value in column col is any of the values val1, val2, ..., valn.
Record("Resources/data/record.csv", Composition=["Mafic", "Bimodal"])
PointProcessTools.Proxy — Function
Proxy
A continuous function constructed by linearly interpolating the proxy values. The actual data type of a Proxy is Interp, which is an alias for Interpolations.Extrapolation (from the Interpolations package), but the Proxy function acts as a constructor.
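As an illustration of what linear interpolation of the proxy values means, here is a minimal sketch in plain Julia. It does not use the Interpolations package or any PointProcessTools internals; the helper name `linear_interp` is hypothetical, and it assumes the time points are sorted in increasing order.

```julia
# Hypothetical helper, not part of PointProcessTools:
# piecewise-linear interpolation through the points (xs[i], ys[i]),
# assuming `xs` is sorted in increasing order and xs[1] <= x <= xs[end].
function linear_interp(xs::AbstractVector{<:Real}, ys::AbstractVector{<:Real}, x::Real)
    xs[1] <= x <= xs[end] || throw(DomainError(x, "x is outside the range of xs"))
    # Index of the left end of the segment that contains x.
    i = clamp(searchsortedlast(xs, x), 1, length(xs) - 1)
    # Relative position of x inside the segment, between 0 and 1.
    w = (x - xs[i]) / (xs[i + 1] - xs[i])
    return (1 - w) * ys[i] + w * ys[i + 1]
end

value = linear_interp([0, 2, 4, 6, 8, 10], [1, 2, 5, 4, 7, 3], 3)
```

Here `value` lies halfway between the proxy values observed at times 2 and 4.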
Implemented functions: show, minimum, maximum, get_xs, get_ys, -, interval, integral, ∫.
To initialize, a path to a csv file can be provided. The file must contain the times of each observation in the first column and the values of each observation in the second. Look in the Resources/data folder for an example.
Proxy("Resources/data/proxy.csv")
There are two keyword arguments: period and shift.
period is for calculating the finite central difference of the function.
shift is for shifting the function backwards or forwards.
-Proxy("Resources/data/proxy.csv", period=10000, shift=5000)
Instead of a file, it is possible to initialize a Proxy by passing the time and value columns directly as Vectors.
-Proxy(collect(0:1000:999000), rand(1000)) # 1000 time points and 1000 values
Optionally, an interval can be passed too: either one number, representing the total time span of the proxy, or two numbers, representing the start and end of the interval where the proxy is defined. See Interval.
Proxy("Resources/data/proxy.csv", 500000)
Proxy("Resources/data/proxy.csv", 100000, 300000)
PointProcessTools.Parameters — Type
Abstract type for dispatching on the type of parameters.
Used in CIF, simulate, time_transform.
Implemented methods for: length, collect and show.
Possible concrete subtypes are:
ParametersHP -> Homogeneous Poisson | 1 parameter | μ
ParametersIP -> Inhomogeneous Poisson | 2 parameters | μ, γ
ParametersHH -> Homogeneous Hawkes | 3 parameters | μ, α, β
ParametersIH -> Inhomogeneous Hawkes | 4 parameters | μ, γ, α, β
Notice that all values must be non-negative.
Parameters are mostly initialized by the estimate function, but they can also be constructed directly by providing the corresponding values.
rec = Record("Resources/data/record.csv")
params_hp = estimate("hp", rec) # ParametersHP
params_hh = PointProcessTools.ParametersHH(5e-5, 5e-5, 8e-5)
Exported functions
PointProcessTools.fit_test — Function
Performs the goodness-of-fit test for a given model and distance function. The number of simulations used in the bootstrap can be set with the n_sims keyword argument (default 1000).
The model may be provided either as a string or as an instance of the model type. The dist may be provided either as a string or as an instance of the distance type.
For inhomogeneous processes, the proxy field of rec must not be nothing.
Returns a named tuple with fields:
p -> returned p-value of the test
sim_dists -> simulated distances used in calculating the p-value
dist -> distance between the observed and the estimated process
params -> estimated parameters of the model
See Record, distance, Model, simulate, estimate.
rec = Record("Resources/data/record.csv")
fit_test("hp", "ks", rec) # Perform the goodness-of-fit test for a Poisson process using the KS-distance
fit_test("hh", "lp", rec; n_sims=1000) # Perform the test for a Hawkes process using the Laplace distance.
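One common way to turn the simulated distances into a bootstrap p-value is to take the fraction of simulations whose distance is at least as large as the observed one. The sketch below assumes that convention; the package may use a different one (for example a finite-sample correction), and the helper name `bootstrap_pvalue` is hypothetical.

```julia
# Hypothetical helper, not part of PointProcessTools:
# bootstrap p-value as the fraction of simulated distances that are
# at least as large as the observed distance.
bootstrap_pvalue(observed::Real, simulated::AbstractVector{<:Real}) =
    count(>=(observed), simulated) / length(simulated)

# Illustrative numbers only: two of the five simulated distances reach 0.12.
p = bootstrap_pvalue(0.12, [0.05, 0.08, 0.15, 0.20, 0.11])
```

A small p-value then indicates that the observed distance is unusually large compared with what the fitted model produces.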
PointProcessTools.estimate — Function
Estimate the parameters of an observed process as one of the supported models.
The 'model' may be provided either as a string or as an instance of the model type.
For the homogeneous Poisson model, the maximum likelihood estimator (MLE) can be calculated directly (Laub (2021)).
For inhomogeneous Poisson, the MLE is approximated using a Newton-type optimization method.
For both variants of the Hawkes process, the MLE is approximated with an expectation maximization (EM) algorithm (E. Lewis, G. Mohler (2011)).
Returns an instance of Parameters.
See Model, Parameters, Record.
rec = Record("Resources/data/record.csv")
estimate("hp", rec) # Estimate the parameters for a Poisson process
estimate("hh", rec) # Estimate the parameters for a Hawkes process
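For the homogeneous Poisson case, the closed-form maximum likelihood estimator of the rate is the number of events divided by the length of the observation window. The sketch below shows this in plain Julia, independent of the package; the helper name `hp_mle` and its argument layout are hypothetical.

```julia
# Hypothetical helper, not part of PointProcessTools:
# closed-form MLE of the rate of a homogeneous Poisson process
# observed on the window [t_start, t_end].
function hp_mle(events::AbstractVector{<:Real}, t_start::Real, t_end::Real)
    t_end > t_start || throw(ArgumentError("t_end must be greater than t_start"))
    return length(events) / (t_end - t_start)
end

# Six events observed over a window of length 10.
mu_hat = hp_mle([1, 4, 5, 5.5, 8, 8.9], 0, 10)
```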
PointProcessTools.simulate — Function
Simulate one realization of a point process with the given parameters and interval.
This function is dispatched on the type of 'params'. For inhomogeneous processes, a Proxy object is required to provide the intensity function.
Returns a vector containing the event times.
See Parameters, Record, Proxy.
simulate(ParametersHP(1), 0, 100) # Simulates a Poisson process with unit intensity over [0, 100]
params_ih = ParametersIH(1, 4, 2, 4)
proxy = Proxy(collect(0:100), log.(1:101)) # Shifted by one to avoid log(0) = -Inf
simulate(params_ih, 0, 100, proxy) # Simulate an Inhomogeneous Hawkes process
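To give an idea of what a simulation does in the simplest case, a homogeneous Poisson process can be generated by accumulating independent exponential inter-arrival times. This is a standard technique and only a sketch; it is not necessarily how the package implements simulate, and the helper name `simulate_hp` is hypothetical.

```julia
using Random

# Hypothetical helper, not part of PointProcessTools:
# simulate a homogeneous Poisson process with rate `mu` on [t_start, t_end]
# by accumulating exponential inter-arrival times with mean 1 / mu.
function simulate_hp(mu::Real, t_start::Real, t_end::Real; rng=Random.default_rng())
    mu > 0 || throw(ArgumentError("mu must be positive"))
    events = Float64[]
    t = t_start + randexp(rng) / mu
    while t <= t_end
        push!(events, t)
        t += randexp(rng) / mu
    end
    return events
end

# Unit intensity over [0, 100], with a fixed seed for reproducibility.
events = simulate_hp(1.0, 0, 100; rng=MersenneTwister(42))
```

The returned vector is sorted by construction and its length varies around mu times the length of the interval.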
PointProcessTools.CIF — Function
Return the conditional intensity function of a point process as a Proxy.
If only a model and a Record are passed, calculates the CIF of the process with parameters estimated from Record.
The specific model (Homogeneous Poisson, Inhomogeneous Poisson, Hawkes, etc.) is determined by the type of params.
See Parameters, Record, Proxy.
rec = Record("Resources/data/record.csv", "Resources/data/proxy.csv")
CIF("hp", rec) # CIF of a Poisson process with parameters estimated from `rec`
CIF(ParametersIP(1e-4, 2e-2), rec) # CIF of an Inhomogeneous Poisson process
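For intuition, the conditional intensity of a homogeneous Hawkes process with an exponential kernel is often written as λ(t) = μ + α Σ exp(-β (t - tᵢ)), summed over the events tᵢ that occurred before t. The sketch below assumes this parameterization and times that increase forward; the package may scale the kernel differently or use its time-before-present convention, and the helper name `hawkes_intensity` is hypothetical.

```julia
# Hypothetical helper, not part of PointProcessTools:
# conditional intensity of a Hawkes process with an exponential kernel,
# lambda(t) = mu + alpha * sum(exp(-beta * (t - ti))) over events ti < t.
# This parameterization is an assumption made for illustration.
function hawkes_intensity(t::Real, events::AbstractVector{<:Real},
                          mu::Real, alpha::Real, beta::Real)
    # Contribution of past events only; zero when no event precedes t.
    excitation = sum((exp(-beta * (t - ti)) for ti in events if ti < t); init=0.0)
    return mu + alpha * excitation
end

# Intensity one time unit after a single event at t = 1.
lambda = hawkes_intensity(2.0, [1.0], 0.3, 0.5, 1.0)
```

Before the first event the intensity equals the background rate μ, and each event raises it by α before it decays at rate β.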
PointProcessTools.AIC — Function
Calculates the Akaike Information Criterion for a given record and parameters.
This function is dispatched on the type of 'params'.
See likelihood, Parameters, Record.
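The Akaike Information Criterion is defined as AIC = 2k - 2 log L, where k is the number of free parameters and log L is the maximized log-likelihood. A minimal sketch of that formula, taking the log-likelihood as an input rather than computing it from a Record (the helper name `aic` is hypothetical):

```julia
# Hypothetical helper, not part of PointProcessTools:
# AIC from a log-likelihood value and the number of free parameters.
aic(loglik::Real, n_params::Integer) = 2 * n_params - 2 * loglik

# Illustrative numbers only: a 3-parameter model with log-likelihood -120.5.
value = aic(-120.5, 3)
```

When comparing models fitted to the same record, the model with the lower AIC is preferred.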
PointProcessTools.periodicities — Function
Calculates the Fourier transform of a point process. This method uses the structure of the data to improve the speed and precision of the calculations. Each event is represented as a shifted Dirac delta function, so an event that occurred at time t₀ is represented as δ(t - t₀). Since the Fourier transform is linear and the Fourier transform of δ(t - t₀) is F(ω) = exp(-iωt₀), where ω is the frequency, we can just sum these functions over all t₀ in the event history. This allows the computation of only specific components, speeding up the process. See M. Bartlett (1963). Statistical Estimation of Density Function.
Calculates the equivalent of the Fourier transform, but for specific chosen periodicities. It is simply the sum of the complex exponential with the chosen frequency, evaluated at the time of each event in the event record.
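The sum described above can be sketched in a few lines of plain Julia. The helper name `fourier_component` is hypothetical, and the conversion from a period to an angular frequency (ω = 2π / period) is an assumption about how periodicities map to frequencies.

```julia
# Hypothetical helper, not part of PointProcessTools:
# Fourier component of a point process at angular frequency `omega`,
# computed as the sum of exp(-i * omega * t) over the event times.
fourier_component(events, omega) = sum(exp(-im * omega * t) for t in events)

# Events spaced exactly one period apart add up coherently.
events = [0.0, 2.0, 4.0, 6.0]
period = 2.0
component = fourier_component(events, 2pi / period)
power = abs2(component)
```

Because only the requested frequency is evaluated, the cost is proportional to the number of events times the number of chosen periodicities.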
PointProcessTools.simulate_periodicities — Function
Simulates records from a chosen model (see Model), calculates the Fourier transform and returns the power of each frequency component for each simulation.
PointProcessTools.period_function — Function
Constructs the sine wave corresponding to a component from the Fourier transform. This function is not scaled with the magnitude of the component. The scaling can be done by simply multiplying the function by the magnitude of the component and dividing by the length of the record.
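One possible reading of this construction is a unit-amplitude wave whose phase is the argument of the complex Fourier component, so that the wave peaks where the events cluster. The sketch below assumes that reading and the sign convention exp(-iωt₀) for the component; the helper name `unit_wave` is hypothetical and the package's actual phase convention may differ.

```julia
# Hypothetical helper, not part of PointProcessTools:
# unit-amplitude wave for one Fourier component, where `component` is the
# complex sum of exp(-i * omega * t0) over the event times t0.
unit_wave(component::Complex, omega::Real) = t -> cos(omega * t + angle(component))

omega = 2pi / 2.0                      # angular frequency for a period of 2
component = exp(-im * omega * 0.5)     # component produced by a single event at t0 = 0.5
wave = unit_wave(component, omega)
peak = wave(0.5)                       # the wave is evaluated at the event time
```

Multiplying `wave(t)` by the magnitude of the component and dividing by the length of the record gives the scaled version described above.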
Non exported data types
PointProcessTools.Model — Type
Abstract type for dispatching on the type of model.
Used in CIF, simulate, time_transform.
Possible concrete subtypes:
HP -> Homogeneous Poisson
IP -> Inhomogeneous Poisson
HH -> Homogeneous Hawkes
IH -> Inhomogeneous Hawkes
Non exported functions
PointProcessTools.distance — Function
Calculates the distance between the empirical distribution of the interarrival times and a unit exponential distribution. It assumes transf_events are the time transformed event times.
'dist' must be either "KS" for the Kolmogorov-Smirnov distance, or "Lp" for the distance using the Laplace transform (not case sensitive).
See time_transform.
rec = Record("Resources/data/record.csv")
params_hh = ParametersHH(1e-4, 1e-4, 3e-4)
distance("ks", time_transform(params_hh, rec))
distance("Lp", time_transform(params_hh, rec))
For the Kolmogorov-Smirnov distance, the Wikipedia article is sufficient.
For the Laplace distance, see this paper.
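As an illustration of the Kolmogorov-Smirnov case, the distance is the largest gap between the empirical distribution function of the interarrival times and the unit exponential distribution function F(x) = 1 - exp(-x). The sketch below is a textbook version in plain Julia; the helper name `ks_distance_exp` is hypothetical, and it takes interarrival times directly rather than the transformed event vector the package expects.

```julia
# Hypothetical helper, not part of PointProcessTools:
# Kolmogorov-Smirnov distance between the empirical distribution of
# `interarrivals` and the unit exponential distribution, F(x) = 1 - exp(-x).
function ks_distance_exp(interarrivals::AbstractVector{<:Real})
    xs = sort(interarrivals)
    n = length(xs)
    n > 0 || throw(ArgumentError("need at least one inter-arrival time"))
    d = 0.0
    for (i, x) in enumerate(xs)
        F = 1 - exp(-x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - F, F - (i - 1) / n)
    end
    return d
end

transformed = [0.4, 1.1, 1.3, 2.9, 3.6]   # illustrative transformed event times
d = ks_distance_exp(diff(transformed))
```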
PointProcessTools.time_transform — Function
Returns the time transformed event history of the process given the conditional intensity function calculated with respect to the given parameters.
The returned times are always in the interval from 0 to the time transform of the end of the definition interval, which is returned as the last element of the vector.
Used in distance for calculating the KS or Laplace distances.
The function is dispatched based on the type of 'params'.
See Parameters, Record.
rec = Record("Resources/data/record.csv")
params = ParametersHH(1e-4, 1e-4, 2e-4)
time_transform(params, rec)
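In the homogeneous Poisson case the time transform is simple enough to write out: the integrated intensity is Λ(t) = μ (t - t_start), so each event time is just rescaled. The sketch below assumes times that increase forward and appends the transformed end of the interval as the last element, mirroring the description above; the helper name `time_transform_hp` is hypothetical.

```julia
# Hypothetical helper, not part of PointProcessTools:
# time transform under a homogeneous Poisson model with rate `mu`.
# The integrated intensity is Lambda(t) = mu * (t - t_start); the transformed
# end of the interval is appended as the last element of the result.
function time_transform_hp(events::AbstractVector{<:Real}, mu::Real,
                           t_start::Real, t_end::Real)
    transformed = [mu * (t - t_start) for t in sort(events)]
    push!(transformed, mu * (t_end - t_start))
    return transformed
end

tt = time_transform_hp([1, 4, 5, 5.5, 8, 8.9], 0.6, 0, 10)
```

If the model is correct, the differences between consecutive transformed times behave like draws from a unit exponential distribution, which is what the distance function checks.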
PointProcessTools.likelihood — Function
Calculates the likelihood of an observed process with respect to the given parameters.
This function is dispatched on the type of 'params'.
See Parameters, Record.
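For the homogeneous Poisson model, the log-likelihood of n events observed on a window of length T with rate μ is n log(μ) - μ T. The sketch below evaluates that expression; whether the package returns the likelihood or its logarithm is not stated here, so the log form is an assumption, and the helper name `hp_loglikelihood` is hypothetical.

```julia
# Hypothetical helper, not part of PointProcessTools:
# log-likelihood of a homogeneous Poisson process with rate `mu`,
# observed on a window of length `T` containing `n` events.
hp_loglikelihood(n::Integer, mu::Real, T::Real) = n * log(mu) - mu * T

# Six events over a window of length 10: the value peaks at mu = n / T = 0.6.
ll_at_mle = hp_loglikelihood(6, 0.6, 10)
ll_below = hp_loglikelihood(6, 0.5, 10)
ll_above = hp_loglikelihood(6, 0.7, 10)
```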
Tests
PointProcessTools.test_simulation — Function
Tests the simulation algorithm for a given model and record.
The function simulates the model n_sims times with the parameters estimated from the provided Record and compares the expected cumulative number of events over the process interval with the average cumulative number of events generated by the simulations.
Instead of a model as a first argument, it is possible to provide parameters.
The function plots both curves and their difference to visually check whether there is a significant discrepancy between the two.
The function is dispatched based on the type of 'params'.
See simulate, Parameters, Record.
rec = Record("Resources/data/record.csv")
test_simulation("hh", rec)
test_simulation(ParametersHP(1e-4), rec, n_sims=1000, plot_results=true)
PointProcessTools.test_estimation — Function
Tests the estimation algorithm for a given model and record.
The function simulates n_estimations processes with parameters estimated from the provided Record and estimates the parameters from the simulations.
Instead of a model as a first argument, it is possible to provide parameters.
The function then plots the histogram of the distribution of the estimated parameters and the true parameters.
The function is dispatched based on the type of 'params'.
See estimate, Parameters, Record.
rec = Record("Resources/data/record.csv")
test_estimation("hh", rec)
test_estimation(ParametersHP(1e-4), rec; n_estimations=1000, plot_results=true)
PointProcessTools.test_fit_test — Function
Tests the goodness of fit algorithm for a given model and record.
Given a specific model_type and dist_type, the function estimates the parameters of rec as the given model type and simulates n_tests processes with these parameters. The function then runs the goodness-of-fit test on each of these simulations and collects the n_tests p-values.
The p-values are returned and, if the plot_results keyword is set to true, the function plots the distribution of the p-values. A distribution of p-values close to uniform means that the test is working correctly.
See fit_test, Model, distance, Record.
rec = Record("Resources/data/record.csv", "Resources/data/proxy.csv")
test_fit_test("ip", "ks", rec) # Test the goodness-of-fit test for Poisson
test_fit_test("ih", "lp", rec; n_sims=1000, n_tests=1000, plot_results=true)
Index
PointProcessTools.Model
PointProcessTools.Parameters
PointProcessTools.Record
PointProcessTools.AIC
PointProcessTools.CIF
PointProcessTools.Proxy
PointProcessTools.distance
PointProcessTools.estimate
PointProcessTools.fit_test
PointProcessTools.likelihood
PointProcessTools.period_function
PointProcessTools.periodicities
PointProcessTools.simulate
PointProcessTools.simulate_periodicities
PointProcessTools.test_estimation
PointProcessTools.test_fit_test
PointProcessTools.test_simulation
PointProcessTools.time_transform