Siamese Neural Networks for Regression: Similarity-Based Pairing and Uncertainty Quantification

This is a Master's thesis from Uppsala University / Department of Pharmaceutical Biosciences

Author: Yumeng Zhang; [2022]


Abstract: Here we present a similarity-based pairing method for generating compound pairs to train a Siamese Neural Network. Compared with conventional exhaustive pairing, which yields about N^2/2 pairs (N being the size of the training set), this method yields N-1 pairs, significantly reducing training time. Using a multilayer perceptron with the ECFP4 fingerprint, it also shows consistently better prediction performance on three physicochemical property datasets. We further incorporate into the Siamese Neural Network the pre-trained Chemformer, which extracts task-specific chemical features from the input SMILES strings. Using n-shot learning, we propose a way to measure prediction uncertainty. Our results demonstrate that higher accuracy is indeed associated with lower prediction uncertainty. In addition, we discuss implications of the similarity principle in machine learning.
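The pair counts in the abstract can be illustrated with a minimal sketch. The abstract does not say how the N-1 pairs are constructed or how the n-shot uncertainty is computed, so everything below is an assumption made for illustration: fingerprints are represented as sets of on-bits, each compound is paired with its most similar predecessor under Tanimoto similarity, and uncertainty is taken as the standard deviation of n reference-based estimates. The function names (tanimoto, exhaustive_pairs, similarity_pairs, n_shot_predict) are hypothetical and not taken from the thesis.

```python
from statistics import mean, stdev


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def exhaustive_pairs(n: int) -> list:
    """All unordered pairs of distinct items: n*(n-1)/2, roughly N^2/2."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def similarity_pairs(fps: list) -> list:
    """ASSUMED scheme (not from the thesis): pair each compound i >= 1
    with its most similar predecessor, which gives exactly N-1 pairs."""
    pairs = []
    for i in range(1, len(fps)):
        # max() keeps the first index on ties, so the result is deterministic
        j = max(range(i), key=lambda k: tanimoto(fps[i], fps[k]))
        pairs.append((j, i))
    return pairs


def n_shot_predict(ref_values: list, predicted_deltas: list) -> tuple:
    """ASSUMED n-shot scheme: each of n reference compounds yields one
    estimate (reference value + predicted difference); the mean is the
    prediction and the sample standard deviation is the uncertainty."""
    estimates = [y + d for y, d in zip(ref_values, predicted_deltas)]
    spread = stdev(estimates) if len(estimates) > 1 else 0.0
    return mean(estimates), spread


# Toy fingerprints (hypothetical bit indices, not real ECFP4 output)
fps = [frozenset({1, 2, 3}), frozenset({1, 2, 4}),
       frozenset({7, 8}), frozenset({7, 8, 9})]
print(len(exhaustive_pairs(len(fps))), len(similarity_pairs(fps)))
```

For N = 4 the exhaustive scheme produces 6 pairs and the assumed similarity-based scheme produces 3. The gap grows quadratically with N, which is where the reduction in training time would come from.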
