From: Kyle
Subject: Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
Date: Thu, 06 Apr 2023 13:41:40 +0000
>Since it is computing, we could ask about the bootstrap of such
>generated data. I think it is a slippery slope because it is totally
>not affordable to re-train for many cases: (1) we would not have the
>hardware resources from a practical point of view, (2) it is almost
>impossible to tackle the source of indeterminism (the optimization is
>too entailed with randomness).
I have only seen situations where the optimization is "too entailed with
randomness" when models are trained on proprietary GPUs with specific settings.
Otherwise, pseudo-random seeds are perfectly sufficient to remove the
indeterminism.
=>
https://discourse.julialang.org/t/flux-reproducibility-of-gpu-experiments/62092
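To illustrate the point, here is a minimal sketch in plain Python (my own toy
example, not anything from the thread): when every source of randomness flows
from one seeded PRNG, a CPU-only "training" loop is bit-for-bit reproducible.
GPU kernels can still break this, as discussed above.

```python
import random

def train(seed, steps=200):
    """Toy SGD run where all randomness comes from a single seeded PRNG."""
    rng = random.Random(seed)           # one seed controls everything
    w = rng.uniform(-1.0, 1.0)          # random weight initialisation
    for _ in range(steps):
        x = rng.gauss(0.0, 1.0)         # random training sample
        grad = 2 * x * (w * x - 3 * x)  # gradient of (w*x - 3*x)**2 wrt w
        w -= 0.01 * grad                # SGD step
    return w

# Same seed => identical weights, down to the last bit.
print(train(42) == train(42))  # True
# A different seed gives a different (though similar) weight.
print(train(42) == train(7))   # False
```

On a GPU the same loop may not be reproducible even with a fixed seed, because
some parallel reductions are scheduled non-deterministically; that is the
situation the Julia thread above describes.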
Many people think that "ultimate" reproducibility is not practical either.
It's always going to be easier in the short term to take shortcuts which make
conclusions dependent on secret sauce which few can understand.
=> https://hpc.guix.info/blog/2022/07/is-reproducibility-practical/
>From my point of view, pre-trained
>weights should be considered as the output of a (numerical) experiment,
>similarly as we include other experimental data (from genome to
>astronomy dataset).
I think it's a stretch to consider data compression an experiment. In
experiments I am always finding mistakes, hidden by prematurely compressed
data (e.g. inappropriate averages), that confuse the interpretation. Don't
confuse the actual experimental results with dubious data-processing steps.