Chemprop: Aggregation and Extrapolation
Context

I have just finished a first version of my Master’s thesis and am waiting for my supervisor’s feedback. That is why I once again have tons of time to read about my favorite topics and look for a junior position. Recently, I came across Dr. Pat Walters’s blog posts about the extrapolation capacity of Chemprop models.

Part 1: https://patwalters.github.io/Why-Dont-Machine-Learning-Models-Extrapolate/
Part 2 (guest post from Dr. Alan Cheng and Jeffery Zhou): https://patwalters.github.io/GNNs-Can-Extrapolate/

In Part 1, Dr. Walters made an interesting point: Chemprop models struggle to extrapolate molecular weight (MW). Specifically, models trained on compounds with MW below 400 g/mol have difficulty making predictions for compounds with MW above 500 g/mol. I refer to this as extrapolation in the target space since, in this context, it does not involve the input space (such as molecule clusters or chemical space). In the follow-up blog, Dr. Cheng and Zhou raised an even...