City Research Online

DATE: Derivative Alignment Training for Extrapolation with Neural Networks

Lopedoto, E., Weyde, T. ORCID: 0000-0001-8028-9905 & Salako, K. ORCID: 0000-0003-0394-7833 (2024). DATE: Derivative Alignment Training for Extrapolation with Neural Networks. Paper presented at the SGAI BCS AI2024, 17-19 Dec 2024, Cambridge.

Abstract

In this work we introduce DATE (Derivative Alignment Training for Extrapolation), a method to improve the extrapolation behaviour of neural networks (NNs) with Rectified Linear Unit (ReLU) activations on univariate regression tasks. ReLU NNs naturally lend themselves to linear extrapolation beyond the training data range. However, trained ReLU NNs have two known limitations that we address in this paper. First, minimising the prediction error can still leave high variation in the derivative of the model function, which causes variable extrapolation. Second, non-linearities of the model function outside the training data range can lead to inconsistent extrapolation behaviour. In prior work, these issues have been addressed with a set of regularisation functions called ReLEx. To address these issues, we introduce two new regularisation terms: the D1-loss and the IR-loss. The D1-loss directly penalises the deviation of the model derivative from a target derivative estimated from the data by interpolating between neighbouring data points. The IR-loss penalises non-linearities of the ReLU units positioned outside a given range. Optimising the combination of the D1-loss with the IR-loss and/or some of the ReLEx functions constitutes the DATE method. We evaluate DATE on regression tasks with noiseless data generated from analytic functions. We test different DATE configurations and find that training with DATE can reduce the variability of the model slope, prevent non-linearities outside the training data range, and improve extrapolation consistency as measured by different metrics.
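The full paper is not openly available from this record, but the abstract describes the two regularisers concretely enough to sketch. Below is a minimal PyTorch sketch of one plausible reading, not the authors' implementation: d1_loss matches the autograd derivative of the model to finite-difference slopes between neighbouring data points, and ir_loss penalises ReLU kink positions (at x = -b_j / w_j for hidden unit j) that fall outside a given range. The network architecture, the loss weights, and all names here are assumptions for illustration.

    # Illustrative sketch of one reading of the abstract; not the paper's code.
    import torch
    import torch.nn as nn

    class UnivariateReLUNet(nn.Module):
        """Single-hidden-layer ReLU network for univariate regression."""
        def __init__(self, hidden: int = 32):
            super().__init__()
            self.hidden = nn.Linear(1, hidden)
            self.out = nn.Linear(hidden, 1)

        def forward(self, x):
            return self.out(torch.relu(self.hidden(x)))

    def d1_loss(model, x, y):
        """Assumed D1-loss: penalise squared deviation of the model derivative
        from finite-difference slopes between neighbouring (sorted, distinct)
        data points."""
        order = torch.argsort(x.squeeze(-1))
        xs, ys = x[order], y[order]
        target = (ys[1:] - ys[:-1]) / (xs[1:] - xs[:-1])  # per-segment slope
        # Model derivative at the left endpoint of each segment via autograd.
        xg = xs[:-1].detach().clone().requires_grad_(True)
        grad = torch.autograd.grad(model(xg).sum(), xg, create_graph=True)[0]
        return ((grad - target) ** 2).mean()

    def ir_loss(model, x_min: float, x_max: float):
        """Assumed IR-loss: penalise ReLU kink positions outside [x_min, x_max].
        Hidden unit j changes slope at x = -b_j / w_j."""
        w = model.hidden.weight.squeeze(-1)
        b = model.hidden.bias
        safe_w = torch.where(w.abs() < 1e-8, torch.full_like(w, 1e-8), w)
        kink = -b / safe_w
        outside = torch.relu(kink - x_max) + torch.relu(x_min - kink)
        return (outside ** 2).mean()

A training loop would then add weighted versions of both terms to the plain regression loss; the target function, training range, and weights below are placeholders:

    x = torch.linspace(-1.0, 1.0, 50).unsqueeze(-1)
    y = torch.sin(3 * x)  # noiseless analytic target, as in the paper's setup
    model = UnivariateReLUNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(2000):
        opt.zero_grad()
        loss = (
            ((model(x) - y) ** 2).mean()
            + 0.1 * d1_loss(model, x, y)       # hypothetical weight
            + 0.1 * ir_loss(model, -1.0, 1.0)  # training range assumed [-1, 1]
        )
        loss.backward()
        opt.step()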

Publication Type: Conference or Workshop Item (Paper)
Additional Information: This version of the contribution has been accepted for publication, after peer review but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. Use of this Accepted Version is subject to the publisher’s Accepted Manuscript terms of use https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms
Publisher Keywords: Extrapolation, Regression, Neural Networks, Derivative
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: School of Science & Technology
School of Science & Technology > Computer Science
Full text (Accepted Version, PDF): This document is not freely accessible due to copyright restrictions; a copy can be requested from the repository.
