├── Makefile <- Makefile with commands like `make data` or `make train`
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── external <- Data from third party sources.
...
...
@@ -41,23 +40,23 @@ Python == 3.8
Usage
==============================
To get up and running, clone the repository and copy the raw data to `air\data\raw\2020` or `air\data\raw\2019`, depending on your data version. Then go to the root project directory, install all modules, and then install all requirements as follows:
To get up and running, clone the repository and copy the raw data to `air\data\raw\2021`. Then go to the root project directory, install the src module, and then install all requirements as follows:
$ pip install -e .
$ pip install -r requirements.txt
Please note that the project depends on scikit-survival, which requires Microsoft Visual C++ 14.0 or greater. Get it from: https://visualstudio.microsoft.com/visual-cpp-build-tools/
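Before running anything else, it can help to confirm that the raw data was copied to the expected place. The following is a minimal sketch, not part of the project: the helper name `raw_data_ready` is hypothetical, and the path `data/raw/2021` assumes the repository was cloned into a directory named `air` and that the script is run from the root project directory.

```python
from pathlib import Path


def raw_data_ready(raw_dir):
    """Return True if raw_dir exists, is a directory, and holds at least one entry."""
    raw_dir = Path(raw_dir)
    return raw_dir.is_dir() and any(raw_dir.iterdir())


if __name__ == "__main__":
    # Assumption: the current working directory is the root project directory,
    # so the raw data is expected under data/raw/2021.
    expected = Path("data") / "raw" / "2021"
    if raw_data_ready(expected):
        print(f"Raw data found in {expected}")
    else:
        print(f"No raw data in {expected}; copy it there first.")
```

Adjust the year in the path if you work with a different data version.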
In the root project directory there is a client which can create datasets, make models and generate SHAP plots. It is an executable Python script. By default it uses the 2020 data version, encodes the datasets as embeddings, does not make visualizations of the embeddings, and uses the ISO ids of the ATS instead of their real names. To run the client:
In the root project directory there is a client which can create datasets. It is an executable Python script. By default it uses the 2021 data version and encodes the categorical features of the datasets (the assistive aids data) as entity embeddings. To run the client:
$ python .\client.py -h
As an example, to run the client with the 2020 data version and with embeddings:
As an example, to run the client and encode the categorical features as one-hot encoded columns: