[1P-491] Does the optimal structure of Vision Transformer (ViT) demonstrate a universal scaling law in accordance with the scale of the training data?
Keywords: deep learning, image learning, neural network, diffusion models