There is a simple method to detect this: take the model "claimed" to be trained from scratch, take the model you suspect is the original, and generate a new model by averaging their weights: new_model = claimed_model * 0.5 + suspected_model * 0.5.
If the claimed_model really was trained from scratch, the new model will have essentially zero capability (it will generate gibberish or noise). If it is a derivative of the suspected model, the new model will still do something sensible.
It is a bit more interesting for diffusion models, because you can fine-tune toward a different objective, which makes this investigation harder to do, but not impossible.
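
The averaging step can be sketched roughly as below. This is only an illustration, assuming both checkpoints share the same architecture and are available as name-to-weight mappings (for example PyTorch state dicts); the function name `average_state_dicts` and the `alpha` parameter are made up for this sketch.

```python
def average_state_dicts(claimed, suspected, alpha=0.5):
    """Return alpha * claimed + (1 - alpha) * suspected, key by key.

    Both arguments map parameter names to values that support scalar
    multiplication and addition (torch tensors, NumPy arrays, plain floats).
    """
    # The test only makes sense if the two models have identical parameter names.
    if claimed.keys() != suspected.keys():
        raise ValueError("parameter names differ; architectures do not match")

    merged = {}
    for name, claimed_value in claimed.items():
        suspected_value = suspected[name]
        # Compare shapes when the values expose one (tensors/arrays do, floats do not).
        if getattr(claimed_value, "shape", None) != getattr(suspected_value, "shape", None):
            raise ValueError(f"shape mismatch for {name!r}")
        merged[name] = alpha * claimed_value + (1.0 - alpha) * suspected_value
    return merged
```

With PyTorch, one would presumably pass `claimed_model.state_dict()` and `suspected_model.state_dict()`, load the result with `load_state_dict`, and then sample from the merged model to see whether the output is still sensible. Integer buffers (such as step counters) may need to be copied rather than averaged.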