DATE: Derivative Alignment Training for Extrapolation with Neural Networks
Lopedoto, E., Weyde, T. ORCID: 0000-0001-8028-9905 & Salako, K. ORCID: 0000-0003-0394-7833 (2024). DATE: Derivative Alignment Training for Extrapolation with Neural Networks. Paper presented at the SGAI BCS AI2024, 17-19 Dec 2024, Cambridge.
Abstract
In this work we introduce DATE (Derivative Alignment Training for Extrapolation), a method to improve the extrapolation behaviour of neural networks (NNs) with Rectified Linear Unit (ReLU) activation on univariate regression tasks. ReLU NNs naturally lend themselves to linear extrapolation beyond the training data range. However, trained ReLU NNs have two known limitations in their extrapolation properties, which we address in this paper. First, even when the prediction error is minimised, the derivative of the NN model function can still vary strongly, which can cause variable extrapolation. Second, non-linearities of the model function outside the training data range can lead to inconsistent extrapolation behaviour. In prior work, these issues have been addressed with a set of regularisation functions called ReLEx. To address these issues, we introduce two new regularisation terms: the D1-loss and the IR-loss. The D1-loss directly penalises the deviation of the model derivative from a target derivative estimated from the data as the interpolation between neighbouring data points. The IR-loss penalises positions of the non-linearities of the ReLU units outside a given range. Optimising the combination of the D1-loss with the IR-loss and/or some of the ReLEx functions constitutes the DATE method. We evaluate DATE on regression tasks with noiseless data generated from analytic functions. We test different DATE configurations and find that training with DATE can reduce the variability of the model slope, prevent non-linearities outside the training data range, and improve extrapolation consistency as measured by different metrics.
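The abstract describes the two regularisers only in words. Below is a minimal PyTorch sketch of how they could be realised for a one-hidden-layer univariate ReLU network. All names (`ReLUNet`, `d1_loss`, `ir_loss`), the squared penalties, and the choice of comparing the model derivative at the left endpoint of each data interval are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ReLUNet(nn.Module):
    """One-hidden-layer univariate ReLU network: f(x) = V relu(Wx + b) + c."""
    def __init__(self, hidden=32):
        super().__init__()
        self.hidden = nn.Linear(1, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):
        return self.out(torch.relu(self.hidden(x)))

def d1_loss(model, x, y):
    """Penalise deviation of the model derivative f'(x) from target slopes
    obtained by linear interpolation between neighbouring data points
    (finite differences). Assumes x is sorted and shaped (N, 1)."""
    xg = x.clone().requires_grad_(True)
    f = model(xg)
    # df/dx via autograd; create_graph so the penalty is differentiable
    # with respect to the model parameters.
    df = torch.autograd.grad(f.sum(), xg, create_graph=True)[0]
    # Target slope on each interval between neighbouring points.
    slope = (y[1:] - y[:-1]) / (x[1:] - x[:-1])
    # Compare the derivative at the left point of each interval to that
    # interval's slope (one of several possible alignment choices).
    return ((df[:-1] - slope) ** 2).mean()

def ir_loss(model, x_min, x_max):
    """Penalise ReLU kink positions outside [x_min, x_max]. For a unit
    relu(w*x + b), the non-linearity (kink) sits at x = -b / w."""
    w = model.hidden.weight.squeeze(1)  # shape (hidden,)
    b = model.hidden.bias               # shape (hidden,)
    kinks = -b / (w + 1e-8)             # small epsilon avoids division by zero
    below = torch.relu(x_min - kinks)   # distance below the allowed range
    above = torch.relu(kinks - x_max)   # distance above the allowed range
    return (below ** 2 + above ** 2).mean()

# Combined objective (lambda weights are illustrative hyperparameters):
# loss = mse(model(x), y) \
#        + lam_d1 * d1_loss(model, x, y) \
#        + lam_ir * ir_loss(model, x.min(), x.max())
```

With the interval set to the training data range, the IR penalty pushes all kinks inside the observed data, so the model is exactly linear outside it, which is the consistent extrapolation behaviour the abstract aims for.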
| Publication Type: | Conference or Workshop Item (Paper) |
|---|---|
| Additional Information: | This version of the contribution has been accepted for publication after peer review, but it is not the Version of Record and does not reflect post-acceptance improvements or corrections. Use of this Accepted Version is subject to the publisher’s Accepted Manuscript terms of use: https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms |
| Publisher Keywords: | Extrapolation, Regression, Neural Networks, Derivative |
| Subjects: | Q Science > QA Mathematics; Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
| Departments: | School of Science & Technology; School of Science & Technology > Computer Science |
This document is not freely accessible due to copyright restrictions.