AI and Sufficiency – How Much Should Be Disclosed?

It is well appreciated in the field of Artificial Intelligence and Machine Learning that high-quality training data is key to the ability of the trained algorithm to act as an effective classifier. The European Patent Office (EPO) has previously issued guidance to the effect that technical character can be gained based on training data. It is therefore not surprising that the EPO Board of Appeal had cause in T 0161/18 (Äquivalenter Aortendruck/ARC SEIBERSDORF) to consider an AI invention which relied upon training data to provide an inventive step. In this decision, the Board found that a failure to disclose either the input data for training the algorithm or at least one sample training data set led to a lack of sufficiency. This decision provides useful guidance on the level of disclosure required to satisfy this requirement at the EPO for an AI invention, and is perhaps an indication that AI applications may be subjected to data requirements similar to those of the pharmaceutical sector.

The invention in T 0161/18 related to use of an artificial neural network to apply a transformation to a blood pressure curve. The application as filed discussed the training of the neural network and was relatively general about the input data required, stating that it should cover a wide range of patients with diverse characteristics to avoid over-specialisation of the neural network.

The Board found that the skilled person was not provided with enough information to put the invention into effect. Specifically, without more information about the input data, the skilled person was essentially left to determine the input data themselves, which was considered to constitute an ‘undue burden’ and potentially also to require inventive skill. For this reason, the application was refused as insufficient.

In discussing the refusal, the Board noted that the application as filed did not disclose either which input data are suitable for training the neural network (e.g. a feature vector of training data) or a sample training data set. This suggests that the Board would have accepted either as satisfying the requirements of sufficiency. This is a helpful indication of the level of disclosure that is required to meet the requirements of sufficiency at the EPO in connection with the training of AI inventions. Such sufficiency issues are well understood in the pharmaceutical field, where the inclusion of data in an application is key in establishing that the invention can be reduced to practice.

There is a practical point to consider here: an AI algorithm and the associated training methodology can often remain secret in a commercial deployment, e.g. in a Software as a Service-type model in which both the AI algorithm and training methodology can remain entirely opaque to the end client. Including training data in a patent application in this scenario causes information to become publicly available that could otherwise remain secret, potentially tipping the scales towards retaining the AI invention as a trade secret rather than applying for a patent.

It is also worth noting that even once the sufficiency requirement is met, there is still inventive step to consider. The applicant sought to rely upon the neural network as the inventive feature of the claims, arguing that the closest prior art did disclose a method for applying a transformation to a blood pressure curve but did not disclose a neural network capable of applying this transformation. The applicant further argued that the claimed neural network was inventive because it achieved a technical effect of guaranteeing a precise output at modest computational cost.

The Board was not convinced by this line of argumentation, stating that without more detailed information on the training of the neural network it was not credible that the technical effect relied upon by the applicant was achieved across the entire scope claimed. The Board instead considered the claims, as interpreted based on the description, to encompass a neural network with an unspecified set of weights – i.e. including an untrained neural network that could not be said to achieve the technical effect relied upon by the applicant. It seems that even some evidence of the technical effect being achieved would not have helped here, unless it was present in combination with details of the training methodology or trained model.

Compounding this issue is the fact that AI algorithms per se are treated as mathematical methods by the EPO – a category excluded from patentability – meaning that to rely upon an AI algorithm for inventive step, it is necessary to direct claims to use of the AI algorithm for a particular technical purpose. Transformation of a blood pressure curve can in principle provide this technical purpose, but only where the AI algorithm can be said to credibly achieve this transformation. The breadth at which the AI algorithm was claimed in this case meant that the claims encompassed embodiments that did not credibly achieve this transformation. This gave a further reason why the AI algorithm could not be relied upon for inventive step: it was (at least partially) excluded from patentability. For more information on this, see this segment of our ‘Machine Learning at the EPO’ series on our YouTube channel.

Conclusion

This decision indicates that providing one sample training data set in an application should be enough to meet the requirements of sufficiency. However, it is likely that this would not support an inventive step for the entire scope of a claim that is significantly broader than the specific data set disclosed. In practice, a much broader scope than one particular training data set will very likely be desired, and therefore more than just one training data set should be included at the drafting stage.

It would seem that setting out multiple training data sets in the application could be used to justify inventiveness of a broader claim scope, as is the case for additional compound examples in pharmaceutical applications. However, this still may not be enough to obtain the claim scope that is desired.

Instead, to maximise chances of obtaining broad protection, applicants should consider whether it is possible to link the technical effect(s) that they wish to rely upon for inventive step to a broader statement of the invention, e.g. statements such as ‘a training data feature vector that contains at least features X, Y and Z will achieve the technical effect’, or other such more generic representations of input data for training an AI algorithm. This type of disclosure would combine naturally with one or more training data sets in which the requisite elements are present, and indeed would arguably support inventive step more effectively than a disclosure of any number of training data sets. Moreover, T 0161/18 suggests that this formulation should satisfy the requirements of sufficiency even without any training data set disclosure, such that all bases are covered.
