2.3 The input and output interfaces

To fit a custom program into our platform, nothing is more important than the input and output interfaces. The custom materials you need are:

Dataset
Training / Validation code
Model weights (optional for FL, required for FV)

The dataset belongs to the Co-PI, while the code and model weights belong to the PI. If you are both Co-PI and PI, please separate the files into different folders. Here are some examples. Note that you do not have the parts highlighted in orange initially, so let's prepare them step by step.

Example-1

CoPI

|---dataset

| |---img1.jpg

| |---img2.jpg

| |---labels.csv

|---dataset.zip # created by compressing the dataset folder

|---main.py # custom training / validation code

|---requirements.txt # the packages that main.py needs, plus FLaVor

|---Dockerfile # script for building an image that contains main.py and has packages in requirements.txt installed

Example-2

CoPI

|---data.csv

|---data.csv.tar # created by compressing data.csv

|---main.py # custom training / validation code

|---requirements.txt # the packages that main.py needs, plus FLaVor

|---Dockerfile # script for building an image that contains main.py and has packages in requirements.txt installed

Dataset

Compress dataset into a file (.zip, .tar or .tar.gz)

The dataset can be in any format and may even consist of multiple files, but it must be packed into a single compressed file with the extension ".zip", ".tar" or ".tar.gz".
The platform decompresses the file with the command that matches its extension, inside a folder called INPUT_PATH, which will be introduced later. For instance,
- In Example-1, after the Co-PI uploads "dataset.zip", the system runs "cd $INPUT_PATH && unzip dataset.zip", so the final paths are "$INPUT_PATH/dataset/*.jpg".
- In Example-2, after the Co-PI uploads "data.csv.tar", the system runs "cd $INPUT_PATH && tar -xf data.csv.tar", so the final path is "$INPUT_PATH/data.csv".
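
As a rough illustration of how code might locate the extracted files of Example-1, here is a minimal sketch. It assumes that the platform exposes INPUT_PATH as an environment variable and that "labels.csv" is a plain comma-separated file; the function names `get_input_path` and `load_example1` are hypothetical and not part of the platform.

```python
import csv
import os
from pathlib import Path


def get_input_path():
    # Assumption: INPUT_PATH is provided as an environment variable.
    # Fall back to the current folder for local testing.
    return Path(os.environ.get("INPUT_PATH", "."))


def load_example1(input_path):
    """Return (image_paths, label_rows) for the Example-1 layout."""
    dataset_dir = Path(input_path) / "dataset"
    # Sorted so that the order is the same on every machine.
    images = sorted(dataset_dir.glob("*.jpg"))
    # Assumption: labels.csv is comma-separated; rows are returned unparsed.
    with open(dataset_dir / "labels.csv", newline="") as f:
        rows = list(csv.reader(f))
    return images, rows
```

In a script this could be called as `images, rows = load_example1(get_input_path())`.
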
The extracted file structure may differ depending on the decompression tool. Note that the platform uses Linux commands to extract the files, so please confirm the decompressed file structure in advance. For example, if a file "foo.csv" is compressed into "foo.zip", then:
- Double-clicking "foo.zip" on macOS might generate "./foo/foo.csv".
- Running "unzip foo.zip" on Ubuntu will generate "./foo.csv".
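
One way to confirm the structure before uploading is to list the paths stored inside the archive, since those are the paths that "unzip" or "tar -xf" will create. The following is a minimal sketch using only the Python standard library; the helper name `archive_members` is hypothetical.

```python
import tarfile
import zipfile


def archive_members(path):
    """Return the member paths stored in a .zip, .tar or .tar.gz archive."""
    path = str(path)
    if path.endswith(".zip"):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    if path.endswith((".tar", ".tar.gz")):
        # The default read mode detects gzip compression automatically.
        with tarfile.open(path) as tf:
            return tf.getnames()
    raise ValueError(f"unsupported archive extension: {path}")
```

For instance, if `archive_members("foo.zip")` returns `["foo.csv"]`, the file will be extracted directly into the target folder without an extra "foo" directory.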

Code

1. Convert .ipynb to .py

If your code is a Jupyter Notebook (".ipynb"), please follow the instructions below to convert it into a Python script (".py"). Although ".ipynb" is convenient for development because it stores both the input code and the output results, Python scripts, which store the input code only, are more widely used for deployment because they are lightweight and stable. Please choose one of the following methods to convert the code:

Method 1 - Copy and paste directly
- Create an empty text file, e.g. xxx.txt
- Change its extension to ".py", e.g. xxx.py
- Copy the code in each cell from the ".ipynb" file into the ".py" file, in order.
Method 2 - Use commands (Recommended)

pip install jupyter nbconvert

jupyter nbconvert --to script xxx.ipynb
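
If neither method is convenient, the copy-and-paste steps of Method 1 can also be automated. The sketch below is not what nbconvert does internally; it is a simplified stand-in that assumes the notebook uses the nbformat 4 JSON layout (a top-level "cells" list). The function name `notebook_to_script` is hypothetical, and notebook-only syntax such as "%" magics is copied as-is and must be removed by hand.

```python
import json


def notebook_to_script(ipynb_path, py_path):
    """Write the code cells of a notebook to a .py file, in order."""
    with open(ipynb_path, encoding="utf-8") as f:
        notebook = json.load(f)
    chunks = []
    for cell in notebook.get("cells", []):
        # Skip markdown and raw cells; outputs are never copied.
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The source is stored either as a list of lines or as one string.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    with open(py_path, "w", encoding="utf-8") as f:
        f.write("\n\n".join(chunks) + "\n")
    return len(chunks)
```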

2. Standard machine learning processes

Please ensure that your training or validation code conforms to the standard machine learning process below (rearrange it first if it does not), and make sure it runs properly.

Training code (FL):
- Load data
- Initialize a model
- (Optional) Load pre-trained weights into the model
- Training # Please make sure that training converges within a certain number of epochs
Validation code (FV):
- Load data
- Initialize a model
- Load pre-trained weights into the model
- Validating
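
The two step lists above can be sketched as a script skeleton. This is only an illustration of the expected structure, not code required by the platform: the data set, the one-variable linear model, the JSON weight file and all function names are stand-ins chosen so that the example runs without any third-party package.

```python
import json
import os


def load_data():
    # Stand-in data set: points on the line y = 2x + 1.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    return [(x, 2.0 * x + 1.0) for x in xs]


def init_model():
    # Stand-in model: y = w * x + b.
    return {"w": 0.0, "b": 0.0}


def load_weights(model, path):
    with open(path) as f:
        model.update(json.load(f))
    return model


def mse(model, data):
    errors = [(model["w"] * x + model["b"] - y) ** 2 for x, y in data]
    return sum(errors) / len(errors)


def train(model, data, epochs=500, lr=0.05):
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (model["w"] * x + model["b"] - y) * x for x, y in data) / n
        grad_b = sum(2 * (model["w"] * x + model["b"] - y) for x, y in data) / n
        model["w"] -= lr * grad_w
        model["b"] -= lr * grad_b
    return model


def run_training(weights_path=None):
    data = load_data()                      # Load data
    model = init_model()                    # Initialize a model
    if weights_path and os.path.exists(weights_path):
        load_weights(model, weights_path)   # (Optional) load pre-trained weights
    return train(model, data)               # Training


def run_validation(weights_path):
    data = load_data()                      # Load data
    model = init_model()                    # Initialize a model
    load_weights(model, weights_path)       # Load pre-trained weights
    return mse(model, data)                 # Validating
```

Keeping each step in its own function makes it easier to adapt the code to the input and output interfaces later.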

3. Model inputs and outputs

There is NO limitation on model inputs and outputs. The only required interface concerns the code; see the next item.

4. Input and output interfaces (IMPORTANT)

There are three things we need to do:

Modify the code to fit the input interface
Modify the code to fit the output interface
Pack the modified code into a Docker image

Here we provide a library called FLaVor to complete these requirements.

For FL, please follow 3.1 An overview of FLaVor FL to complete the code interface.

For FV, please follow 4.1 An overview of FLaVor FV to complete the code interface.

Weights

PyTorch: xxx.ckpt
- type(model): torch.nn.Module
- Object to save: {"state_dict": model.state_dict()}
- Save with: torch.save
TensorFlow: xxx.ckpt
- type(model): tensorflow.keras.models.Model
- Object to save: {"state_dict": {str(key): value for key, value in enumerate(model.get_weights())}}
- Save with: pickle.dump
XGBoost: xxx.json
- type(model): xgboost.Booster
- Object to save: model
- Save with: the save_model method of the model (xgboost.Booster.save_model)
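
To make the TensorFlow format above concrete, here is a minimal sketch of packing and unpacking such a checkpoint with pickle. It takes the list returned by model.get_weights() as input, so it runs without TensorFlow; the function names are hypothetical, and the loading step is an assumption about how the dictionary could be read back, not a description of what the platform does.

```python
import pickle


def pack_keras_weights(weights):
    """Build the checkpoint dict from the list returned by model.get_weights()."""
    return {"state_dict": {str(key): value for key, value in enumerate(weights)}}


def save_checkpoint(weights, path):
    with open(path, "wb") as f:
        pickle.dump(pack_keras_weights(weights), f)


def load_checkpoint(path):
    """Return the weights as a list ordered by their integer keys."""
    with open(path, "rb") as f:
        state = pickle.load(f)["state_dict"]
    # Keys are "0", "1", ...; rebuild the original list order explicitly.
    return [state[str(i)] for i in range(len(state))]
```

With a real Keras model, this would be called as `save_checkpoint(model.get_weights(), "xxx.ckpt")`, and the list returned by `load_checkpoint` could be passed to `model.set_weights`.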

Note that if you have multiple sets of model weights, you can upload several of them to a project and choose one for each plan.