* Statistical models (ETS, (V)ARIMA(X), etc)
* ML models (sklearn models, LGBM, etc)
* Many recent deep learning models (N-BEATS, TFT, etc)
* Seamlessly works on multi-dimensional series
* Models can be trained on multiple series
* Several models support taking in external data (covariates), known either in the past only, or also in the future
* Many models offer rich support for probabilistic forecasts
* Model evaluation is easy: Darts has many metrics and offers backtesting, etc.
* Deep learning scales to large datasets, using GPUs, TPUs, etc
* You can do reconciliation of forecasts at different hierarchical levels
* There's now even an explainability module for some of the models, showing you what matters for computing the forecasts
* (coming soon): an anomaly detection module :)
* (also, it even includes FB Prophet if you really want to use it)
Warning: I'm probably biased because I'm the creator of Darts.
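To give a sense of the API, here's a minimal sketch along the lines of the Darts quickstart, using the bundled AirPassengers dataset (exact class names may vary between versions):

```python
from darts.datasets import AirPassengersDataset
from darts.models import ExponentialSmoothing

# Load a small monthly toy series and hold out the last 36 points
series = AirPassengersDataset().load()
train, val = series[:-36], series[-36:]

# Fit a simple model and draw probabilistic forecast samples
model = ExponentialSmoothing()
model.fit(train)
prediction = model.predict(len(val), num_samples=500)
```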
"What library do I use?" is the wrong question. "What model do I use?" is the right question. Libraries are just part of the process of answering that question.
That said, high-quality implementations of interesting time series models seem hard to come by, so it's still a legitimate question to ask about libraries. But consider the goal of asking: you want to find high-quality implementations of useful models, not a magic black box that you can crank data through.
As an example, imagine you want to forecast only a single sample into the future. Say furthermore that you have six input time series sampled hourly, and you don't expect meaningful correlation with samples older than 48 hours.
You create 6x48 input features, take the single target value that you want to predict as the output, and feed this into your run-of-the-mill gradient-boosted tree.
This gives you a less complex approach than reaching for bespoke time-series tooling; I've personally had success doing something like this.
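A rough sketch of that setup with synthetic data (shapes and names are made up for illustration; any gradient-boosted regressor would do):

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

# Hypothetical data: 6 hourly input series and one target series
rng = np.random.default_rng(0)
inputs = rng.normal(size=(5000, 6))                 # (n_hours, 6)
target = inputs[:, 0] + rng.normal(scale=0.1, size=5000)

LAGS = 48
X, y = [], []
for t in range(LAGS, len(target)):
    # 6 series x 48 hourly lags = 288 features to predict the next sample
    X.append(inputs[t - LAGS:t].ravel())
    y.append(target[t])
X, y = np.asarray(X), np.asarray(y)

model = HistGradientBoostingRegressor()
model.fit(X, y)

# One-step-ahead forecast from the latest 48 hours
next_value = model.predict(inputs[-LAGS:].ravel()[None, :])
```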
If your regressor does not support multiple outputs, you can always wrap it in sklearn's MultiOutputRegressor (or optionally RegressorChain; check it out). This is useful if, in the above example, you are not looking to predict only the next sample, but maybe the next 12 samples.
https://scikit-learn.org/stable/modules/generated/sklearn.mu...
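A sketch of the wrapping, assuming the same hypothetical 6x48 lag features and a 12-step-ahead target matrix:

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6 * 48))   # stand-in for the 6x48 lag features above
Y = rng.normal(size=(1000, 12))       # next 12 samples as a multi-output target

# HistGradientBoostingRegressor is single-output; MultiOutputRegressor fits one
# independent copy per output column (RegressorChain would chain them instead).
model = MultiOutputRegressor(HistGradientBoostingRegressor())
model.fit(X, Y)
forecast_12h = model.predict(X[:1])   # shape (1, 12)
```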
Don't ask me what they do with all of these; I'm just the guy who makes sure the forecasts stay reproducible.
- Prophet - seems to be the current 'standard' choice
- ARIMA - Classical choice
- Exponential Moving Average - dead simple to implement (see the sketch after this list), works well for stuff that's a time series but not very seasonal
- Kalman/Statespace model - used by Splunk's predict[1] command (pretty sure I always used LLP5)
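As promised above, a sketch of how little code the EMA option needs (simple exponential smoothing with a flat forecast; no trend or seasonality, and alpha chosen arbitrarily):

```python
import numpy as np

def ema_forecast(series, alpha=0.3, horizon=12):
    """Smooth the series and repeat the last level over the horizon."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return np.full(horizon, level)

y = np.array([12.0, 13.5, 12.8, 14.1, 13.9, 15.2, 14.8])
print(ema_forecast(y, alpha=0.4, horizon=3))
```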
I did some anomaly detection work on business transactions, and found the best approach was a sort of ensemble: apply all the models, keep every anomaly, then use simple rules to only alert on 'interesting' anomalies, to improve signal vs. noise. For example:
- 2-3 anomalies in a row
- high deviation from expected
- multiple models all detected the anomaly
[1]: https://docs.splunk.com/Documentation/Splunk/9.0.1/SearchRef...
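Not the original code, but a sketch of how that kind of rule-based filtering could look (all names and thresholds here are hypothetical):

```python
import numpy as np

def interesting_anomalies(flags_by_model, deviations, threshold=3.0):
    """flags_by_model: dict of model name -> boolean array of per-period anomaly flags.
    deviations: array of deviations from expected (e.g. z-scores).
    Returns indices of periods worth alerting on."""
    flags = np.array(list(flags_by_model.values()))   # (n_models, n_periods)
    any_flag = flags.any(axis=0)

    agree = flags.sum(axis=0) >= 2                    # multiple models agree
    high_dev = np.abs(deviations) >= threshold        # far from expected
    consecutive = any_flag & np.roll(any_flag, 1)     # anomaly in this and the previous period
    consecutive[0] = False                            # no wrap-around at the start

    return np.where(any_flag & (agree | high_dev | consecutive))[0]
```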
Get some forecasts quickly => FB Prophet. It's not as good as they'd have you believe, but it's fast and analysts can play with it to some extent (quick-start sketch below).
Outlier detection => Hand-rolled C++ ETS framework.
Multilevel predictions and/or more complex tasks => That's where neural models start to have the edge, but at that point it's a costly project. I like simpler stuff to start, moving to the big guns if/when it's needed.
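For reference, the Prophet quick-start really is only a few lines; it expects a dataframe with `ds`/`y` columns (toy daily data here, just to show the workflow):

```python
import pandas as pd
from prophet import Prophet

# Toy daily series
df = pd.DataFrame({
    "ds": pd.date_range("2023-01-01", periods=365, freq="D"),
    "y": range(365),
})

m = Prophet()
m.fit(df)
future = m.make_future_dataframe(periods=30)
forecast = m.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```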
1. Time-series forecast based on revenue (the one OP is referring to). All the statistical time-series models come in here. I primarily used H2O.ai for this.
2. Conversion-based revenue forecast (input -> pipeline, output -> revenue). This proved to be quite tricky, as there was a time lag between pipeline creation and revenue conversion.
3. Delphi method: got the sales/pre-sales folks on the ground to predict a bottom-up number and used that as a forecast.
Finally, I combined them by applying weights to the above approaches, based on how accurate each was on the test dataset.
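One simple way to do that kind of weighting, sketched with made-up numbers (inverse-error weights; the parent may well have used a different scheme):

```python
import numpy as np

def combine_forecasts(forecasts, test_errors):
    """forecasts: dict of approach -> array of future predictions.
    test_errors: dict of approach -> error on the hold-out set (e.g. MAPE).
    Weights are inverse errors, normalised to sum to one."""
    names = list(forecasts)
    inv = np.array([1.0 / test_errors[n] for n in names])
    weights = inv / inv.sum()
    stacked = np.stack([forecasts[n] for n in names])
    return weights @ stacked

blend = combine_forecasts(
    {"time_series": np.array([100.0, 110.0]),
     "conversion":  np.array([ 95.0, 105.0]),
     "delphi":      np.array([120.0, 115.0])},
    {"time_series": 0.08, "conversion": 0.12, "delphi": 0.15},
)
```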
IMHO, like many here have pointed out, the model/assumptions are more important than the library. The job of a data scientist is to make the prediction as reliable and explainable as possible.
Besides that, I also like statsmodels as the docs are pretty good.
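For example, an ARIMA fit in statsmodels is only a few lines (synthetic monthly data here, just to show the workflow):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical monthly series with a trend and a mild cycle
idx = pd.date_range("2018-01-01", periods=60, freq="MS")
y = pd.Series(100 + np.arange(60) + 5 * np.sin(np.arange(60) / 6), index=idx)

model = ARIMA(y, order=(1, 1, 1))
res = model.fit()
print(res.summary())
print(res.forecast(steps=12))   # 12-month-ahead point forecasts
```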
https://tsfresh.readthedocs.io/en/latest/
https://www.sktime.org/en/v0.8.2/api_reference/auto_generate...
When modeling time series, you will want a model that is sensitive both to short-term and longer-term movements. In other words, a Long Short-Term Memory (LSTM) network.
Sepp Hochreiter invented this concept in his Master's thesis supervised by Jürgen Schmidhuber in Munich in the 1990s; today, it's the most-cited type of neural network.
Here are papers describing it: https://people.idsia.ch/~juergen/rnn.html
In Python, you can use TensorFlow's LSTMCell class: https://www.datacamp.com/tutorial/lstm-python-stock-market
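A minimal Keras sketch of a one-step-ahead LSTM forecaster on a toy series (the window size and layer sizes are arbitrary choices for illustration, not a recommendation):

```python
import numpy as np
import tensorflow as tf

# Toy univariate series, turned into sliding windows of 48 steps -> next value
window = 48
series = np.sin(np.arange(1000) * 0.1).astype("float32")
X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., None]                      # shape (samples, timesteps, features)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(window, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=64, verbose=0)

next_step = model.predict(X[-1:])     # one-step-ahead forecast
```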