3.1 What datasets will be used to validate the AI model?
We take the well-known MNIST dataset as an example. It was created by "re-mixing" the samples from NIST's original datasets. The creators felt that NIST's original split was not well-suited for machine learning experiments, because the training dataset was taken from American Census Bureau employees while the testing dataset was taken from American high school students.
For our purposes, it is enough to know where the ground truth data can be collected from. The MNIST dataset prepared in a later chapter (hello-fl) was downloaded from the official PyTorch site. Its structure is shown below.