*Issue #, if available:*
*Description of changes:* Update README.
By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
*Issue #, if available:* Fixes #193
*Description of changes:* Passing in contexts in lower precision than
float32 may result in a drop in accuracy. This change ensures that the
tokenizer (which does scaling and quantization) operates on a float32
batch.
Tested across GPU/CPU and different context dtypes with:
```python
from itertools import product
import pandas as pd
import torch
from chronos import ChronosPipeline
import matplotlib.pyplot as plt # requires: pip install matplotlib
import numpy as np
df = pd.read_csv("https://raw.githubusercontent.com/AileenNielsen/TimeSeriesAnalysisWithPython/master/data/AirPassengers.csv")
for context_dtype, context_device, model_dtype, model_device in product(
    [torch.bfloat16, torch.float16, torch.float32],
    ["cpu"],  # only cpu input supported at the moment
    [torch.bfloat16, torch.float16, torch.float32],
    ["cpu", "cuda"],
):
    pipeline = ChronosPipeline.from_pretrained(
        "amazon/chronos-t5-tiny",
        device_map=model_device,
        torch_dtype=model_dtype,
    )
    forecast = pipeline.predict(
        context=torch.tensor(df["#Passengers"]).to(dtype=context_dtype, device=context_device),
        prediction_length=65,
        num_samples=20,
        limit_prediction_length=False,
    )
    assert forecast.dtype == context_dtype, f"{forecast.dtype=} but {context_dtype=}"
    assert str(forecast.device) == context_device, f"{forecast.device=} but {context_device=}"
    forecast_index = range(len(df), len(df) + 65)
    low, median, high = np.quantile(
        forecast[0].to(device="cpu", dtype=torch.float32).numpy(), [0.1, 0.5, 0.9], axis=0
    )
    plt.figure(figsize=(8, 4))
    plt.plot(df["#Passengers"], color="royalblue", label="historical data")
    plt.plot(forecast_index, median, color="tomato", label="median forecast")
    plt.fill_between(forecast_index, low, high, color="tomato", alpha=0.3, label="80% prediction interval")
    plt.legend()
    plt.grid()
    plt.show()
```
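The precision loss behind this change can be illustrated without a model. The sketch below only uses the standard library to round-trip a single number through 16-bit and 32-bit IEEE 754 storage; the value `1234.3` is a made-up observation, and the snippet says nothing about bfloat16 or about how the tokenizer itself is implemented.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store a Python float in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 1234.3  # hypothetical observation, chosen only for illustration

as_half = round_trip(value, "e")    # "e" = 16-bit float (float16)
as_single = round_trip(value, "f")  # "f" = 32-bit float (float32)

print(f"float16: {as_half}  (error {abs(as_half - value):.4f})")
print(f"float32: {as_single}  (error {abs(as_single - value):.6f})")
```

Between 1024 and 2048, float16 can only represent whole numbers, so any scaling or binning done on such a value starts from an already-rounded input.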
Fixes https://github.com/amazon-science/chronos-forecasting/issues/181.
Chronos' tokenizer has a vocabulary size of `n_tokens`. Among these,
there are `n_special_tokens` reserved for EOS, PAD, etc. and `n_tokens -
n_special_tokens` allocated to numerical values. However, the provided
`MeanScaleUniformBins` tokenizer creates `n_tokens - n_special_tokens +
1` different buckets, resulting in a total of `n_tokens + 1` possible
tokens. This causes training and inference errors when one of the data
points gets allocated to the largest bucket, as the model requires
`0 <= token_id < n_tokens`.
This PR modifies the `MeanScaleUniformBins` tokenizer so that it
creates one fewer bucket for numerical values.
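The off-by-one can be reproduced with a small standard-library sketch. It assumes a layout in which bin boundaries are the midpoints between evenly spaced centers plus two outer sentinels, and token ids are the bucket index plus `n_special_tokens`. The function name, the toy sizes and the `low`/`high` limits are illustrative assumptions, not the library's code.

```python
from bisect import bisect_right


def max_token_id(n_centers: int, n_special_tokens: int,
                 low: float = -15.0, high: float = 15.0) -> int:
    """Largest token id a uniform-bin tokenizer of this (assumed) layout can emit."""
    step = (high - low) / (n_centers - 1)
    centers = [low + i * step for i in range(n_centers)]
    # Midpoints between neighbouring centers, plus two sentinels for outliers.
    boundaries = (
        [-1e20]
        + [(a + b) / 2 for a, b in zip(centers, centers[1:])]
        + [1e20]
    )
    # A value above every finite boundary lands in the last bucket.
    largest_bucket = bisect_right(boundaries, 1e6)
    return largest_bucket + n_special_tokens


n_tokens, n_special_tokens = 10, 2  # toy sizes, not a real configuration

too_many = max_token_id(n_tokens - n_special_tokens, n_special_tokens)
one_fewer = max_token_id(n_tokens - n_special_tokens - 1, n_special_tokens)

print(too_many, too_many < n_tokens)    # largest id equals n_tokens: out of range
print(one_fewer, one_fewer < n_tokens)  # largest id stays below n_tokens
```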
---------
Co-authored-by: Lorenzo Stella <lorenzostella@gmail.com>
*Issue #, if available:*
*Description of changes:*
*Issue #, if available:* Fixes #154
*Description of changes:* Prior to the fix, some workers had no dataset
to consume if `dataloader_num_workers > len(training_data_paths)`.
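A minimal sketch of how this situation can arise, assuming datasets are handed out to workers round-robin (the description does not say how the split is done). `shard_paths` and the file names are hypothetical, and the guard at the end is one possible mitigation, not necessarily what this PR implements.

```python
def shard_paths(paths: list[str], num_workers: int) -> list[list[str]]:
    """Give worker i every num_workers-th path, starting at index i."""
    return [paths[worker_id::num_workers] for worker_id in range(num_workers)]


paths = ["a.arrow", "b.arrow"]  # hypothetical file names

shards = shard_paths(paths, num_workers=4)
idle = [i for i, shard in enumerate(shards) if not shard]
print(idle)  # indices of workers that received no dataset

# One possible guard (an assumption): never use more workers than datasets.
safe_workers = min(4, len(paths))
assert all(shard_paths(paths, safe_workers))
```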
*Description of changes:* Adds generation params to command line options
for the evaluation script.
*Description of changes:* Title.
*Description of changes:* This PR updates README.md with dataset and
evaluation details.
*Description of changes:* This PR adds configs and a script to evaluate
Chronos models in the same way as described in the paper.
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>
*Description of changes:* This PR sets `drop_prob = 0` when training
causal models. Missing values are problematic for causal model training.
*Description of changes:* This PR adds support for training
causal/decoder-only models.
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>
*Description of changes:* Adds templates for GitHub issues.
*Description of changes:* Removes print statements that got left inside
from a debugging session.
*Description of changes:* Run CI at 8 AM UTC every day.
Co-authored-by: Lorenzo Stella <stellalo@amazon.com>
*Description of changes:* Automatically set `tf32` to `False` if used on
an older NVIDIA GPU. Reorder seed so that the seed is saved as part of
the training config.
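The TF32 fallback can be sketched as a small pure function. TF32 requires an NVIDIA GPU with compute capability 8.0 (Ampere) or newer; `resolve_tf32` is a hypothetical helper name, and in practice the capability tuple could come from `torch.cuda.get_device_capability()`. How the training script performs the check is not stated in the description.

```python
def resolve_tf32(requested: bool, compute_capability: tuple[int, int]) -> bool:
    """Return the tf32 setting to use for a GPU with the given capability."""
    # TF32 needs Ampere (compute capability 8.0) or newer.
    supported = compute_capability[0] >= 8
    return requested and supported


print(resolve_tf32(True, (7, 5)))   # e.g. a Turing-class GPU
print(resolve_tf32(True, (8, 0)))   # e.g. an Ampere-class GPU
print(resolve_tf32(False, (9, 0)))  # the user opted out, so it stays off
```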
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>
*Description of changes:* Updates the citation.
*Description of changes:* This splits `input_transform` into
`context_input_transform` and `label_input_transform`. Previously,
`input_transform` was used for both context and label during
training, which would lead to incorrect results when `prediction_length`
> `context_length`.
TODO:
- [x] Update docstrings
- [x] Test the training script
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>
*Description of changes:* This PR relaxes `torch` and `transformers`
versions to allow for older versions that were used during original
training. This is needed in light of recent `torch`/`transformers`
versions being slower with DDP.
Relevant issues (but the problem may be deeper than these):
- https://github.com/huggingface/transformers/issues/30840
- https://github.com/pytorch/pytorch/issues/127077
- https://github.com/NVIDIA/nccl/issues/1298
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>
*Description of changes:* This PR updates the training script to also
save the training details in the final checkpoint.
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>
*Issue #, if available:* #76
*Description of changes:* `transformers` 4.41 broke something for us; we
need to look into it more deeply.
*Description of changes:*
The bin indexes were shifted by one between the input transform and the
output transform. Subtracting 1 from the sampled tokens in the output
transform leads to the correct reconstruction of the signal.
Adds a test to ensure the consistency of the Chronos tokenizer.
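The shift can be reproduced with a toy round trip. The sketch assumes that encoding is "bucket index plus special-token offset" over boundaries that include a lower sentinel, which makes the bucket index 1-based. All names and numbers below are illustrative, not the library's implementation or its test.

```python
from bisect import bisect_right

N_SPECIAL = 2                                      # toy value
CENTERS = [-2.0, -1.0, 0.0, 1.0, 2.0]              # toy bin centers
BOUNDARIES = [-1e20, -1.5, -0.5, 0.5, 1.5, 1e20]   # midpoints plus sentinels


def encode(value: float) -> int:
    """Bucket index (1-based, due to the lower sentinel) plus the special-token offset."""
    return bisect_right(BOUNDARIES, value) + N_SPECIAL


def decode(token_id: int, shift: int) -> float:
    """Map a token id back to a bin center, clamping to the valid range."""
    index = token_id - N_SPECIAL - shift
    index = max(0, min(index, len(CENTERS) - 1))
    return CENTERS[index]


tokens = [encode(v) for v in CENTERS]
shifted = [decode(t, shift=0) for t in tokens]   # every value lands one bin too high
restored = [decode(t, shift=1) for t in tokens]  # subtracting 1 recovers the input

assert restored == CENTERS
```

A consistency test in this spirit encodes known bin centers, decodes them, and checks that the output matches the input.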
Co-authored-by: Lorenzo Stella <stellalo@amazon.com> and Abdul Fatir
Ansari <ansarnd@amazon.com>
*Description of changes:* Looks better in dark mode.
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>
*Issue #, if available:*
*Description of changes:*
There is one space missing in the example training command.
Co-authored-by: Ubuntu <ubuntu@ip-172-31-43-83.us-west-2.compute.internal>
*Description of changes:* Adds details to the Readme on how to push a
fine-tuned model to HF Hub.
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>
*Description of changes:* Simplifies content in the "Usage" section and
fixes a link.
*Description of changes:* Adds usage examples for `scripts/`.
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>
*Description of changes:* See title.
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>
*Description of changes:* This PR adds the script to generate synthetic
data from KernelSynth.
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>
*Description of changes:* Adds the training script and config files. These
can be used for pre-training, or adapted for fine-tuning Chronos models.
---------
Co-authored-by: Abdul Fatir <Abdulfatirs@gmail.com>
*Issue #, if available:*
*Description of changes:*
---------
Co-authored-by: Abdul Fatir <Abdulfatirs@gmail.com>
*Description of changes:* This PR revamps the README.
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>
*Description of changes:* Adds `CITATION.cff` file so that we can get
the "Cite this repository" option in the sidebar.
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.com>
*Issue #, if available:* #28 (also, PR #41)
*Description of changes:* This PR updates the README with information on
MLX support.
*Description of changes:* Minor simplification to how the tokenizer is
constructed from the config.
*Description of changes:* Speeds up the GitHub workflow by installing the
CPU-only version of torch.
*Description of changes:* Fixes some type checking issues, adds mypy to
the GitHub workflow, and applies black.
*Description of changes:* This PR adds `pipeline.embed` which extracts
encoder embeddings from the model. These embeddings may be useful for
some downstream tasks such as classification, so this is useful to have.
---------
Co-authored-by: Abdul Fatir Ansari <ansarnd@amazon.de>
*Issue #, if available:* Unnecessary context padding slows down
inference. We evaluated the models from HF with this change and found
no concerning accuracy issues.
Test code for a context of length 200:
```python
import time

import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-large",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)
context = torch.ones((8, 200))
prediction_length = 24
num_runs = 10
t0 = time.time()
for _ in range(num_runs):
    forecast = pipeline.predict(
        context,
        prediction_length,
        num_samples=20,
    )
t1 = time.time()
print(f"total time: {t1 - t0}")
```
Before the change:
```
total time: 20.005481481552124
```
After the change:
```
total time: 9.82350754737854
```
*Description of changes:* Remove padding in case the provided batch is
shorter than `context_length`.
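The idea can be sketched with plain lists: pad only up to the longest series in the batch (capped at `context_length`) instead of always padding to `context_length`. Left-padding with NaN, the helper name `left_pad` and the value `512` are assumptions made for illustration; the description does not specify how padding is done.

```python
import math


def left_pad(batch: list[list[float]], target_length: int) -> list[list[float]]:
    """Left-pad each series with NaN to target_length, keeping the latest values."""
    padded = []
    for series in batch:
        recent = series[-target_length:]  # truncate series that are too long
        padded.append([math.nan] * (target_length - len(recent)) + recent)
    return padded


context_length = 512  # hypothetical model context length
batch = [[1.0] * 200, [1.0] * 150]

padded_to_context = left_pad(batch, context_length)
padded_to_longest = left_pad(batch, min(context_length, max(map(len, batch))))

print(len(padded_to_context[0]), len(padded_to_longest[0]))
```

With the second variant, the model processes sequences of length 200 rather than 512 for this batch, which is the kind of saving the timings above reflect.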
*Description of changes:* This PR adds optional inference params such as
`num_samples`, `top_k`, etc. to the example in the README for clarity.
*Issue #, if available:* N/A
*Description of changes:*
Thanks for the very clean implementation of the Model, Tokenizer, and Pipeline.
I was curious about it and found a minor improvement in the API. What do
you think about it? Feel free to close. The change is untested.