Examining patent applications relating to artificial intelligence (AI) inventions: The Scenarios

2022-09-24 08:48:23 By : Ms. Annie King


This publication is licensed under the terms of the Open Government Licence v3.0 except where otherwise stated. To view this licence, visit nationalarchives.gov.uk/doc/open-government-licence/version/3 or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or email: psi@nationalarchives.gov.uk.

Where we have identified any third party copyright information you will need to obtain permission from the copyright holders concerned.

This publication is available at https://www.gov.uk/government/publications/examining-patent-applications-relating-to-artificial-intelligence-ai-inventions/examining-patent-applications-relating-to-artificial-intelligence-ai-inventions-the-scenarios

1. This document contains a set of ‘scenarios’ concerning inventions that involve artificial intelligence (AI) or machine learning (ML). It is designed to accompany the guidelines for examining patent applications relating to AI inventions. The guidelines are primarily concerned with the patentability of AI inventions in respect of the excluded matter provisions set out in section 1(2) of the Patents Act 1977.

2. The IPO considers that patents are available for AI inventions in all fields of technology. The scenarios in this document are intended to reflect and illustrate the wide range of diverse technical fields where AI inventions may be found.

3. Each scenario has a very brief description of how its AI invention works and an illustrative example of a patent claim. Each scenario includes a simplified assessment setting out our opinion on how each AI invention would likely be assessed in respect of section 1(2).

4. For the avoidance of doubt, we emphasise that this document is not a source of law. Our opinions on the patentability of the scenarios shall not be binding for any purpose under the Patents Act 1977.

5. The scenarios have been designed to focus on the question of excluded matter only. We have assumed that the claimed inventions are novel and non-obvious. We have also assumed that each scenario is sufficiently disclosed.

6. The assessments of excluded matter we give follow a simplified application of the four-step Aerotel approach. We have omitted detailed consideration of steps 1 and 2 of the Aerotel approach. Specifically:

a. At step 1, we have simply assumed each claim is sufficiently clear and that no issues of construction arise.

b. At step 2, we have simplified the assessment by stating what we consider is the actual contribution.

c. At steps 3 & 4, we have simplified the analysis by focussing in the main on the “program for a computer” exclusion of section 1(2). Unless otherwise indicated, our non-binding opinions are limited to this exclusion.

7. Any comments or questions arising from these scenarios should be addressed to:

Phil Thorpe, Intellectual Property Office, Concept House, Cardiff Road, Newport, South Wales, NP10 8QQ

Nigel Hanley, Intellectual Property Office, Concept House, Cardiff Road, Newport, South Wales, NP10 8QQ

Background

This invention concerns a parking management system located in a car parking facility equipped with camera surveillance.

Images from the system’s cameras are processed by a first neural network which is trained to detect a vehicle approaching an entrance of the facility. When the first neural network detects an approaching vehicle in an image, the image is passed to a second neural network to implement an Automatic Number Plate Recognition (ANPR) system. The second neural network is trained to identify a specific number plate region in the image. A recognition module receives the identified number plate region and applies an optical character recognition algorithm to determine the registration number of the vehicle.
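The three-stage flow described above can be sketched as a simple pipeline. This is only an illustration of the control flow, not the inventor's implementation: the names `PlateReader`, `detect_vehicle`, `locate_plate` and `read_characters` are hypothetical, the two networks and the OCR step are replaced by toy stand-in functions, and an image is reduced to a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) of the plate region


@dataclass
class PlateReader:
    # All three stages are hypothetical stand-ins supplied by the caller.
    detect_vehicle: Callable[[dict], bool]              # first neural network
    locate_plate: Callable[[dict], Optional[Region]]    # second neural network
    read_characters: Callable[[dict, Region], str]      # OCR recognition module

    def process(self, image: dict) -> Optional[str]:
        # Stage 1: only images containing a vehicle are passed on.
        if not self.detect_vehicle(image):
            return None
        # Stage 2: find the region of interest holding the number plate.
        region = self.locate_plate(image)
        if region is None:
            return None
        # Stage 3: OCR is applied to the region of interest only.
        return self.read_characters(image, region)


# Toy stand-ins so the sketch runs; real stages would be trained models.
reader = PlateReader(
    detect_vehicle=lambda img: img.get("vehicle", False),
    locate_plate=lambda img: img.get("plate_region"),
    read_characters=lambda img, region: img["plate_text"],
)

frame = {"vehicle": True, "plate_region": (40, 120, 200, 50), "plate_text": "AB12 CDE"}
empty_frame = {"vehicle": False}
```

Calling `reader.process(frame)` returns the registration string, while `reader.process(empty_frame)` stops at the first stage and returns `None`.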

A number plate recognition system comprising: an image capturing device positioned at an entrance to a parking facility; a computing device for receiving images from the image capturing device and comprising: a first neural network configured to detect a vehicle in a captured image; a second neural network configured to receive an indication of the vehicle from the first neural network, detect the presence of a number plate in the image, and determine a region of interest in which the number plate is located; and a recognition module configured to receive the region of interest and apply an optical character recognition process to the region of interest to determine characters of a registration number of the vehicle.

Beyond a conventional surveillance system having a camera and a computer, the contribution made by the invention is:

a number plate recognition system using a first neural network to detect a vehicle in a captured image; a second neural network to detect the presence of a number plate in the image and to determine a region of interest in which the number plate is located, and a module to apply an optical character recognition process on the region of interest to determine the characters of the vehicle’s registration number.

The contribution is a number plate recognition system which is not excluded by section 1(2). Although the number plate recognition system is computer implemented, it is more than a program for a computer as such because it is carrying out a technical process external to a computer. The number plate recognition system includes a combination of two neural networks and a recognition module that specifically perform image processing operations which are technical in nature (see Vicom). The system has a technical effect in the sense of signpost (i). It self-evidently solves a technical problem relating to the recognition of vehicle registration plates, so signpost (v) would also point to allowability.

The claimed invention is not excluded.

Gas supply systems are complex systems that are monitored by multiple sensors located in the supply system and in its operating environment. Typically, data from sensors can be combined and analysed by an operator to provide the operator with an overview of the operational state, both of the individual components within the system and of the system as a whole. This may assist the operator in identifying faults within the system and options for reconfiguring the system.

However, it is acknowledged that this approach requires specialised skills on the part of the operator and is prone to error, especially when data from large numbers of interconnected sensors must be considered. In particular, understanding the interdependency of changes made to the system is challenging.

This problem has been recognised by the inventor, who has developed an artificial intelligence system to receive and categorise sensor data relating to a gas supply system, identify faults, and recommend system configuration changes to resolve the faults. In making recommendations the AI system will analyse the effect a configuration change may have on the system. The system may implement a recommended configurational change to the system using an automatic operational controller.

A computer-implemented method of managing the operating state of a gas supply system using sensors within the gas supply system and in its operating environment, and characterised in that the method comprises an AI system: receiving and analysing data from the sensors; identifying fault conditions within the gas supply system based on the analysis; and reporting the fault conditions and generating a recommended solution to an automated operational controller of the gas supply system.

The contribution is managing the state of a gas supply system using an AI system that identifies fault conditions in the gas supply system, based on sensor data relating to the operation of the gas supply system, and reports the fault conditions and recommended solutions to an automated operational controller.

The contribution does not fall solely within the computer program exclusion. The contribution made by the invention is a solution to a technical problem lying external to the computer on which the AI system runs, namely the monitoring of the operation of an external technical system (a gas supply system) for fault conditions. This is a technical contribution. Signposts (i) and (v) point to allowability.

The invention defined in the claim is not excluded under section 1(2).

Analysing the motion of an object can be used to identify an activity. In some known examples, such as sporting events, analysing motion of an object can be useful for coaching. Alternatively, in gesture-based systems, a determined gesture can be used as a control mechanism or to issue an alarm. In one known example, a smoking cessation system generates an alarm in a wrist-worn device to deter the user from smoking.

Typically, these known systems operate by comparing real-time data to statistical models to determine motion, and they are heavily reliant on the accuracy of their statistical model. Consequently, systems that rely on the statistical models can be inaccurate.

The inventor has proposed a system that uses motion vectors derived from acceleration, velocity, and orientation in X, Y and Z directions as an input to a neural network to classify the motion. The system functions by receiving motion data in real time from a device such as a sports watch, or other motion sensors. The neural network processes the motion vector using a classification library to classify the motion to a particular movement.
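As a rough sketch of the data flow described above, the code below builds a motion vector from three sensors and classifies it against a small library. Several things here are assumptions made for illustration: the motion vector is a plain concatenation of X, Y and Z readings, the "neural network" is a single dense layer with hand-picked weights, and the three movement labels are invented.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical classification library: output index -> movement label.
MOVEMENT_LIBRARY: Dict[int, str] = {0: "rest", 1: "arm raise", 2: "wrist twist"}


def motion_vector(accel: Sequence[float], gyro: Sequence[float],
                  mag: Sequence[float]) -> List[float]:
    """Concatenate X, Y, Z readings from the three sensors into one vector."""
    return [*accel, *gyro, *mag]


def classify(vector: Sequence[float], weights: List[List[float]],
             biases: List[float]) -> str:
    """Single dense layer plus softmax; returns the most probable movement."""
    logits = [sum(w * x for w, x in zip(row, vector)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # numerically stable softmax
    probs = [e / sum(exps) for e in exps]
    return MOVEMENT_LIBRARY[probs.index(max(probs))]


# Illustrative, hand-picked weights (a trained network would learn these).
WEIGHTS = [
    [0.0] * 9,                          # "rest": no strong evidence
    [0, 0, 1.0, 0, 0, 0, 0, 0, 0],      # "arm raise": driven by Z acceleration
    [0, 0, 0, 1.0, 0, 0, 0, 0, 0],      # "wrist twist": driven by X rotation rate
]
BIASES = [0.5, 0.0, 0.0]

vec = motion_vector(accel=(0.1, 0.0, 2.0), gyro=(0.2, 0.0, 0.0), mag=(0.3, 0.1, 0.0))
movement = classify(vec, WEIGHTS, BIASES)
```

With these sample readings the strong Z acceleration dominates, so `movement` is `"arm raise"`.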

A computer-implemented device for analysing motion comprising: a controller having a data interface, a neural network, and a movement classification library; sensors including a gyroscope, a magnetometer, and an accelerometer, wherein data from each sensor is output to the controller via the data interface; characterised in that the controller is operable to: determine a motion vector from the received data; and provide the determined motion vector to the neural network, wherein the neural network is configured to classify the motion vector as one of a particular movement in the classification library.

The contribution is a device that determines a motion vector from data captured by its sensors (gyroscope, magnetometer, accelerometer) and uses a neural network and classification library to classify the motion vector as a movement from the library.

The contribution is not solely a program for a computer since its task is to perform a process of classifying measured sensor data describing the physical motion of a computing device. This process is a technical process lying outside the computing device and is carried out by technical means. It concerns the classification of real-world sensor data as a determined movement. Signpost (i) would point to patentability.

The invention defined in the claim is not excluded under section 1(2).

Cavitation in a pumping system is the formation of vapour bubbles in the inlet flow region of the pump, which can cause accelerated wear, and mechanical damage to pump seals, bearings and other pump components, mechanical couplings, gear trains, and motor components.

A pump system has a measuring apparatus adapted to measure pump flow and pressure data associated with the pump system. A classifier system detects pump cavitation according to the flow and pressure data. The classifier system comprises a neural network which is trained using back propagation. The measuring devices comprise sensors (1,2) for measuring input pressure and output pressure associated with an inlet (3) and an outlet (4), respectively, of the pump system. The flow through the pump is also measured.

1. A method of training a neural network classifier system to detect cavitation in a pump system, the method including the steps of: correlating each of a plurality of measured pump flow and pressure data pairs with one of a plurality of class values, thereby producing a training data set, wherein each of the plurality of class values is indicative of an extent of cavitation within the pump system and at least one of the plurality of class values is indicative of no cavitation in the pump system; and training the neural network classifier system using the training data set and back propagation.

2. A method for detecting cavitation in a pump system comprising: measuring pump flow and pressure data; detecting pump cavitation according to said flow and pressure data; wherein the detection step includes providing said flow and pressure data as inputs to a classifier system using a trained neural network, wherein the neural network provides a signal indicative of the existence and extent of cavitation in the pump system, and updating said signal (6) during operation of said pump system.
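The training step of claim 1 can be illustrated with a very small network trained by back propagation. This is a sketch under stated assumptions rather than the claimed method itself: the (flow, pressure) pairs and their class values are made up, the network has one tanh hidden layer of four units, and the learning rate and epoch count are arbitrary.

```python
import math
import random

# Hypothetical measured (flow, pressure) pairs, each correlated with a class
# value: 0 = no cavitation, 1 = mild cavitation, 2 = severe cavitation.
TRAINING_SET = [
    ((0.9, 0.9), 0), ((0.8, 0.8), 0),
    ((0.6, 0.5), 1), ((0.5, 0.5), 1),
    ((0.3, 0.2), 2), ((0.2, 0.1), 2),
]
N_IN, N_HIDDEN, N_OUT = 2, 4, 3

rng = random.Random(0)
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]
b2 = [0.0] * N_OUT


def forward(x):
    """Return hidden activations and softmax class probabilities."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return hidden, [e / total for e in exps]


def mean_loss():
    """Average cross-entropy loss over the training data set."""
    losses = [-math.log(forward(x)[1][label]) for x, label in TRAINING_SET]
    return sum(losses) / len(losses)


def train_epoch(rate=0.1):
    for x, label in TRAINING_SET:
        hidden, probs = forward(x)
        # Output-layer error for softmax with cross-entropy loss.
        d_out = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
        # Propagate the error back through the tanh hidden layer.
        d_hid = [(1 - h * h) * sum(d_out[k] * w2[k][j] for k in range(N_OUT))
                 for j, h in enumerate(hidden)]
        for k in range(N_OUT):
            for j in range(N_HIDDEN):
                w2[k][j] -= rate * d_out[k] * hidden[j]
            b2[k] -= rate * d_out[k]
        for j in range(N_HIDDEN):
            for i in range(N_IN):
                w1[j][i] -= rate * d_hid[j] * x[i]
            b1[j] -= rate * d_hid[j]


loss_before = mean_loss()
for _ in range(1000):
    train_epoch()
loss_after = mean_loss()
```

Comparing `loss_before` with `loss_after` shows whether training has reduced the classification error on the training set. The detection step of claim 2 would then correspond to calling `forward` on newly measured flow and pressure data.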

Beginning with claim 1, the contribution it makes is a computer-implemented method of training (i.e. setting up) a neural network classifier so it can detect cavitation in a pump system, where the method uses back propagation with a set of training data comprising measurements of pump flow and pressure from the pump system that are each correlated with a value indicating a corresponding extent of pump cavitation in the system.

Although the contribution relies on a computer program, it is more than a computer program as such. The contribution relates to a process of using physical data to train a classifier for a specific technical purpose, namely the detection of cavitation in a pump system. This is technical in nature. There is a technical contribution in the sense of signpost (i).

Claim 2 also reveals a technical contribution since it relates to the use of a trained classifier for the specific technical purpose of detecting cavitation in the pump system. A technical process lying outside the computer in the sense of signpost (i) is performed.

The invention defined in claims 1 and 2 is not excluded under section 1(2).

Many cars are fitted with catalytic converters to reduce the amounts of gases such as NOx and CO in their exhaust fumes. A problem for such converters is that their operational efficiency changes with the ratio of fuel-to-air in the combustion chambers of the engine. The fuel-to-air ratio must therefore be controlled to be maintained at a fixed value to maintain the efficient operation of the catalytic converter.

It is known to control the amount of fuel injected into an engine’s combustion chamber using feed forward control in relation to throttle position and additional feedback control in relation to an oxygen sensor (or air/fuel sensor) provided in the exhaust. Although this works well, it can be difficult to control the fuel-to-air ratio correctly when the engine is accelerating or decelerating.

The inventor has developed an injector control system which uses a trained neural network to determine an amount by which a given fuel injection amount should be adjusted during acceleration/deceleration to maintain a correct fuel-to-air ratio and thus maintain catalytic converter efficiency. The neural network receives data inputs relating to the operational state of the engine, such as engine speed (RPM), intake air pressure, throttle position, fuel injection amount, air intake temperature, engine coolant temperature, and data from an exhaust gas sensor. The neural network outputs a signal indicating a change to the fuel injection amount for controlling the engine.

A computer-implemented neural network for adjusting the amount of fuel injected into a cylinder of a combustion engine, the neural network comprising: an input layer having: an input for receiving the RPM of the engine; an input for receiving intake air pressure of the engine; an input to receive current throttle position; an input to receive the present injected fuel amount; an input to receive air intake temperature; an input to receive water cooling temperature; an input to receive exhaust gas sensor data; at least one hidden layer, wherein the hidden layer is connected to the input layer; an output layer connected to the at least one hidden layer; and wherein the output layer has an output indicating an amount by which the fuel injection should be changed.
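The layer structure recited in the claim (seven inputs, at least one hidden layer, one output) might look something like the following. The class and field names are hypothetical, the weights are untrained random placeholders, and the input values are assumed to be normalised, so the number produced is meaningless except as a demonstration of the shape of the computation.

```python
import math
import random
from dataclasses import astuple, dataclass


@dataclass
class EngineState:
    # The seven inputs named in the claim; units and scaling are assumptions.
    rpm: float
    intake_air_pressure: float
    throttle_position: float
    injected_fuel_amount: float
    air_intake_temperature: float
    coolant_temperature: float
    exhaust_gas_sensor: float


class FuelAdjustmentNetwork:
    """Input layer (7) -> one tanh hidden layer -> single linear output."""

    def __init__(self, hidden_units: int = 5, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Untrained placeholder weights; a real system would load trained ones.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(7)]
                         for _ in range(hidden_units)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(hidden_units)]

    def adjustment(self, state: EngineState) -> float:
        inputs = astuple(state)
        hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
                  for row in self.w_hidden]
        # Output: amount by which the fuel injection should be changed.
        return sum(w * h for w, h in zip(self.w_out, hidden))


net = FuelAdjustmentNetwork()
state = EngineState(0.45, 0.8, 0.3, 0.5, 0.25, 0.9, 0.5)  # normalised, illustrative
delta = net.adjustment(state)
```

In a real controller `delta` would be fed to the injector control loop; here it simply shows that a single scalar adjustment is derived from the seven engine-state inputs.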

The contribution is a neural network that outputs a control signal relating to an amount by which fuel injection should be changed based on inputs relating to the operational state of the engine, as defined in the claim.

The contribution is a solution to a technical problem lying outside a computer, i.e. maintaining correct fuel-to-air ratio in an engine, so it is more than a program for a computer as such. The neural network takes as its inputs data representing the operating state of the engine and outputs a control signal indicating the amount by which a fuel injection amount should change. The control signal is suitable for controlling a technical process that exists outside of the computer on which the neural network runs. This is a technical contribution. Signposts (i) and (v) apply.

The invention defined in the claim is not excluded under section 1(2).

It is useful to measure the percentage of blood leaving each ventricle of a heart when determining the health of the heart. This measurement can be estimated by a skilled operator of an ultrasound imaging system by imaging a heart and marking out and measuring the boundaries of the ventricles of the heart at either extreme of a heartbeat. However, the accuracy of the operator’s estimate depends upon the operator’s skill and judgement.

The inventor has devised a method in which a trained neural network is used to provide a measurement of the percentage of blood ejected by a heart by analysing a series of images of the heart over a heartbeat. The neural network is trained using a supervised learning approach.

A computer-implemented method for determining a percentage of blood ejected from a given heart during a heartbeat, the method comprising: training a neural network with heart imaging data sets, each set comprising imaging data of a ventricle over time and associated blood ejection percentages, the sets being associated with different hearts; and using the trained neural network to: receive a set of imaging data of a ventricle of the given heart; output a percentage of blood ejection for the given heart.

The contribution is a method of estimating a percentage of blood ejected from a heart by training a neural network with heart imaging data which has been labelled with blood ejection percentage, and then obtaining an estimate for the percentage of blood ejected from a given heart over a heartbeat by providing a set of images of that heart (over its heartbeat) to the trained neural network.

The contribution is more than a program for a computer as such because it relates to an improved measurement of the percentage of blood ejected from a heart during a heartbeat. This is a technical measurement of a physical system. This improved measurement is an example of a technical effect upon a process lying outside the computer that implements the invention, following signpost (i). This is a technical contribution.

The invention is not excluded under section 1(2).

Traders on a trading exchange monitor the performance of various stocks and tradeable instruments to try to identify opportunities to make a beneficial trade. It requires specialist knowledge, understanding, and experience to recognise and identify patterns and trends in the market. This means traders will often specialise in a narrow range of instruments e.g. energy shares, financial derivatives, or commodities.

The inventor has recognised that this can result in beneficial trades being overlooked. A trader may either miss a trading opportunity for instruments held as part of their position or a chance to reduce a loss or increase a profit from a transaction. To assist the trader, the inventor has developed an AI that can identify patterns and correlations between share and instrument prices, identify trades based on recent performance and timing differences, and predict future behaviours. One advantage the AI offers is the opportunity to ‘see’ connections that would be otherwise opaque and not obvious.

The AI is coupled to an automatic brokerage platform to allow it to execute trades according to profit/loss limits provided by the trader.

A computer-implemented financial instrument trading system comprising an exchange market, a broker terminal, an AI assistant, and an automated brokerage system, characterised in that the AI assistant is configured to: receive current and historical price data for tradeable financial instruments from the exchanges; cross reference combinations of financial instruments to identify correlated groups of instruments; identify trends within each correlated group; receive trader positions from the broker terminal; and based on the identified trends and trader positions, issue automated transaction instructions to the automated brokerage system.
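The "cross reference combinations of financial instruments to identify correlated groups" step could, for example, be approximated by a pairwise correlation check. The sketch below uses the Pearson coefficient, which is an assumption (the scenario does not say how correlation is measured); the instrument names, prices and the 0.9 threshold are all made up.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length price series."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlated_pairs(prices: Dict[str, List[float]],
                     threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Cross-reference every pair of instruments; keep strongly correlated ones."""
    return [(m, n) for m, n in combinations(sorted(prices), 2)
            if abs(pearson(prices[m], prices[n])) >= threshold]


# Made-up price histories for three hypothetical instruments.
history = {
    "OIL": [10.0, 11.0, 12.0, 13.0, 14.0],
    "GAS": [20.0, 22.1, 23.9, 26.0, 28.0],
    "TECH": [5.0, 7.0, 4.0, 8.0, 5.0],
}
pairs = correlated_pairs(history)
```

Here only the OIL and GAS series move together closely enough to be grouped; TECH is weakly correlated with both and is left out.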

The contribution is a computer implemented financial instrument trading system having an artificial intelligence assistant to monitor correlations between tradeable instruments and implement automated trades according to profit/loss limits set by a trader.

The contribution is wholly concerned with a method of doing business as such. All inputs and all outputs of the system relate purely to the trading of financial instruments whether they be prices or trading instructions. The result of the invention is no more than a financial instrument trading system. This does not count as a technical contribution.

The claimed invention is excluded from patent protection under s.1(2) because it is a method of doing business as such.

Identifying current and future health needs for people is a major undertaking. The very nature of health issues often means that a health system is reactive rather than proactive. As such, identifying future demand on a personal and on a population level is often difficult.

The inventor has discovered that it is possible to use machine learning techniques to analyse patients’ health records to allocate patients to risk groups or sub populations where a future health intervention may be required. This may allow health planners to “assess the population” and to identify suitable patient groups within the population for drug trials or alternative treatments.

The system uses an AI device to take in the health records of a patient population. The AI device identifies empirical variables in the health records, looks for correlations between variables, and uses correlated variables to create markers that can be used to identify groupings of patients in the population. The patient health records will necessarily include administrative records but may also include previous treatment histories and details of medical test results.

A computer-implemented method of identifying future medical needs of a population, performed by an AI device, the method comprising the steps of: inputting patient records into the AI device; aggregating data from the patient records; identifying a plurality of variables from the aggregated data; identifying correlations between the variables; allocating patients to groups on the basis of the correlated variables; and outputting a range of health metrics for each group.

The contribution of the invention is identified as: a computer implemented method of analysing patient data and grouping patients into groups on the basis of the analysis.

The contribution consists of solely excluded subject matter. It is a program that merely analyses information content of patient data and is no more than a program for a computer as such. The invention does not represent a technical process outside a computer, nor does it contribute to the solution of a technical problem external to a computer. Further, analysing patient data to determine patient groupings is a wholly administrative task and is no more than a method of doing business as such.

The claimed invention is excluded as a program for a computer as such and/or a method of doing business as such.

Unsolicited or mailshot e-mails are seen by many users as a nuisance. These e-mails often fill up mailboxes and potentially prevent the user seeing important communications. To deal with this situation many well-known rule-based systems exist to identify these messages as junk and move them to a junk e-mail (or similar low priority) mailbox.

The inventor has realised that these known approaches have limitations as one set of rules will not suit all users. What is junk e-mail for one user is not necessarily the case for another. Further, senders of junk e-mail will adapt what they send to any given rule set.

To counter these problems, the inventor has developed an AI system that learns through user feedback. The system works by parsing the text of all incoming e-mails using a trained AI classifier to classify the e-mail according to its content and semantic structure. The AI classifier is trained on a corpus of previously classified e-mails. The AI classifier classifies received e-mail as either junk, not junk, or unsure. E-mails that are identified as ‘unsure’ can be manually classified by the user. The e-mail and its classification are then used to appropriately adjust the training of the AI classifier.
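The classify, ask and retrain loop described above can be sketched as follows. A simple word-count scorer stands in for the trained AI classifier, which is an assumption made to keep the example short; the class name, the `margin` parameter and the sample messages are likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


class FeedbackClassifier:
    """Word-count stand-in for the trained AI classifier (an assumption)."""

    def __init__(self, margin: float = 0.2) -> None:
        self.counts: Dict[str, Counter] = {"junk": Counter(), "not junk": Counter()}
        self.margin = margin  # hypothetical confidence band for "unsure"

    def train(self, corpus: Iterable[Tuple[str, str]]) -> None:
        for text, label in corpus:
            self.counts[label].update(text.lower().split())

    def classify(self, text: str) -> str:
        words = text.lower().split()
        junk = sum(self.counts["junk"][w] for w in words)
        ham = sum(self.counts["not junk"][w] for w in words)
        if junk + ham == 0:
            return "unsure"                  # nothing recognised
        share = junk / (junk + ham)          # fraction of evidence for junk
        if abs(share - 0.5) < self.margin:
            return "unsure"
        return "junk" if share > 0.5 else "not junk"

    def feedback(self, text: str, user_label: str) -> None:
        # The user's manual classification adjusts the classifier's training.
        self.train([(text, user_label)])


clf = FeedbackClassifier()
clf.train([("win a free prize now", "junk"),
           ("meeting agenda for monday", "not junk")])

first = clf.classify("claim your voucher today")   # unseen words
clf.feedback("claim your voucher today", "junk")   # user decides it is junk
second = clf.classify("claim your voucher today")
```

The first call returns `"unsure"` because none of the words have been seen; after the user's decision is fed back, the same message is classified as `"junk"`.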

The complete system provides a junk e-mail filtering system that adapts to the needs of the user and the changing behaviour of junk e-mail creators.

A computer implemented method of identifying a received electronic message as belonging to a class of message, the method comprising the steps of: parsing the content of the message using a trained AI classifier that classifies the content, according to its textual content and semantic structure, as either a first class, a second class, or not sure; if the message is classified as not sure, receiving input from a user classifying the message as either the first or the second class; and updating the training of the AI classifier using the user-classified message and its classification.

The contribution is a method of classifying e-mails as junk or not junk, based on the textual content and semantic structure of the e-mail, using an AI classifier. When the AI classifier is not sure whether an e-mail is junk, it asks a user to decide and uses the result to update the classifier.

This contribution is no more than a program for a computer as such. Beyond merely asking for user input, the contribution does no more than analyse the text content of electronic communications to determine a classification for those communications. It consists of the mere manipulation of data which has no technical effect beyond the running of a program on a computer. There is no technical contribution. None of the signposts point to allowability.

The claimed invention is excluded under the program for a computer exclusion of s.1(2).

High performance cache memory or storage is commonly used in computer systems to improve system performance by mitigating the slower performance of an associated data store. Frequently accessed data held in the data store may be stored (cached) in the cache memory so when needed it can be retrieved quickly.

The overall performance of a memory system depends on choosing the data to be stored in the cache from its associated data store. Although a cache may start empty, once it is fully populated, the contents of the cache must be managed to maintain performance by removing and replacing the data stored in it.

There are two schemes for identifying data to be removed from the cache when adding new data to it. The first is to remove the least recently used (LRU) data. The second scheme removes the least frequently used but not new (LFU) data. One scheme may lead to better overall system performance than the other depending on the particular data stored in the cache.

The inventor has devised a method of managing the populating of a cache by using a trained neural network to identify whether the LRU scheme or the LFU scheme will lead to best system performance. The neural network takes, as its inputs, the data identified for removal by each of the LRU and LFU schemes along with other characteristics of the cache, e.g. its size and the ratio of the number of times requested data is found in the cache against the number of times it is not (known as the cache hit to miss ratio). The neural network then returns an indication of which scheme to use to ensure best system performance.
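A minimal sketch of this selection mechanism is given below. It makes several simplifying assumptions: the trained neural network is replaced by a caller-supplied function `choose_scheme`, the cache passes it keys rather than the cached data itself, and the LFU candidate is simply the entry with the fewest uses (the "but not new" refinement is ignored). The class and method names are hypothetical.

```python
from typing import Callable, Dict, Hashable


class Cache:
    def __init__(self, capacity: int, choose_scheme: Callable[..., str]) -> None:
        self.capacity = capacity
        self.choose_scheme = choose_scheme  # stand-in for the trained network
        self.data: Dict[Hashable, object] = {}
        self.last_used: Dict[Hashable, int] = {}
        self.uses: Dict[Hashable, int] = {}
        self.clock = self.hits = self.misses = 0

    def get(self, key, load):
        self.clock += 1
        if key in self.data:
            self.hits += 1
        else:
            self.misses += 1
            if len(self.data) >= self.capacity:
                self._evict()
            self.data[key] = load(key)  # fetch from the slower data store
            self.uses[key] = 0
        self.uses[key] += 1
        self.last_used[key] = self.clock
        return self.data[key]

    def _evict(self) -> None:
        # Each scheme nominates its own candidate for removal.
        lru_key = min(self.data, key=lambda k: self.last_used[k])
        lfu_key = min(self.data, key=lambda k: self.uses[k])
        ratio = self.hits / max(self.misses, 1)  # cache hit-to-miss ratio
        scheme = self.choose_scheme(lru_key, lfu_key, self.capacity, ratio)
        victim = lru_key if scheme == "LRU" else lfu_key
        for table in (self.data, self.last_used, self.uses):
            del table[victim]


def run(scheme: str) -> set:
    """Replay one access pattern with the selector forced to a single scheme."""
    cache = Cache(capacity=2, choose_scheme=lambda *inputs: scheme)
    for key in ["a", "a", "b", "c"]:
        cache.get(key, load=str.upper)
    return set(cache.data)
```

For the access pattern in `run`, forcing LRU evicts `"a"` (the oldest access) while forcing LFU evicts `"b"` (used only once), which is the kind of divergence the selector is meant to arbitrate.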

A method for managing data stored in a data cache for a data store in a computer system that uses the cache as a means to store frequently accessed data from the data store, the method comprising: using a first removal algorithm to identify first data to be removed from the cache; using a second removal algorithm to identify second data to be removed from the cache; providing the first data, the second data, a cache size value, and a cache hit-to-miss ratio as inputs to a trained neural network, wherein the trained neural network provides as its output a selection of either the first or the second algorithm to be used for removing data from the cache; and when adding new data from the data store to the cache, using the selected algorithm to remove data from the cache.

The contribution is managing the data in a cache by using a neural network to select an optimum removal algorithm for removing data from the cache, the selection being based on the data selected for removal by different algorithms and cache performance characteristics.

The contribution is more than a program for a computer as such because it is concerned with solving a technical problem to do with the internal workings of a computer, i.e. improving how the memory hierarchy in a computer works. The invention improves the operation of a computer regardless of the applications being run or the nature of the data being processed, and it makes the computer more efficient and effective. It reveals a technical effect in the sense of signposts (ii), (iv), and (v).

The invention defined in the claim is not excluded under section 1(2).

Malicious actors may gain access to protected computer systems through obtaining valid authentication credentials of authorised users. They may, for example, use a phishing attack on the user to obtain their username and password.

The inventor has devised a method of identifying malicious actors should they gain access to a computer system, thereby allowing remedial action to be carried out. This is done by comparing the usage characteristics of malicious actors with those of the user with whose details they have gained access or whom they are impersonating.

The method involves training a machine learning algorithm on an initial set of data representing a user’s characteristic usage for the computer system (for example the applications they use, their manner of typing, and their use of a mouse). Subsequently, when someone logs in with that user’s credentials, the trained machine learning algorithm is used to score the authenticity of the ‘user’ according to a newly measured characteristic usage for the computer system. If the machine learning algorithm indicates that the ‘user’ is likely not the authentic user, the system identifies the user as a malicious actor and remedial action can be carried out.
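The two-score comparison can be illustrated with a toy example. Everything specific here is an assumption: a fixed weighted sum stands in for the trained machine learning model, and the feature names, weights and tolerance are invented for the sketch.

```python
from typing import Dict

# Hypothetical usage features and weights standing in for the trained model.
MODEL_WEIGHTS: Dict[str, float] = {
    "keys_per_minute": 0.01,
    "mouse_speed": 0.5,
    "apps_opened": 0.1,
}


def behaviour_score(usage: Dict[str, float]) -> float:
    """Stand-in for the machine learning model: a weighted sum of features."""
    return sum(MODEL_WEIGHTS[name] * usage[name] for name in MODEL_WEIGHTS)


def is_malicious(characteristic: float, current: float,
                 tolerance: float = 0.5) -> bool:
    """Flag the session when the two scores differ by more than the tolerance."""
    return abs(current - characteristic) > tolerance


# Score recorded after the first authentication (the genuine user).
enrolled = behaviour_score(
    {"keys_per_minute": 200, "mouse_speed": 1.0, "apps_opened": 5})
# Scores measured after later authentications with the same credentials.
same_user = behaviour_score(
    {"keys_per_minute": 210, "mouse_speed": 1.1, "apps_opened": 5})
intruder = behaviour_score(
    {"keys_per_minute": 40, "mouse_speed": 0.4, "apps_opened": 2})
```

With these made-up numbers, `is_malicious(enrolled, same_user)` is false and `is_malicious(enrolled, intruder)` is true, at which point remedial action could be triggered.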

A computer-implemented method comprising: authenticating a user at a first time on a computer system; in response to authenticating the user at the first time, calculating, using at least one machine learning model, a behaviour characteristic score for the user, the behaviour characteristic score characterising interactions of the user with the computer system; authenticating a user at a second time on a computer system; in response to authenticating the user at the second time, calculating, using the machine learning model, a behaviour score characterising interactions of the user with the computer system; and determining that the user authenticated at the second time is a malicious user based on the calculated behaviour characteristic score and the calculated behaviour score.

The contribution made by the invention is the determination that a user logging into a computer system is a malicious user based on a behaviour score calculated by a machine learning model and a known behaviour characteristic score for a genuine user calculated by the machine learning model.

The contribution is more than a computer program as such because it is a solution to a technical problem lying within the computer system, namely the detection of a malicious intrusion. This is achieved by repeatedly monitoring the characteristic usage of the computer system by a user. This is an example of monitoring the internal workings of the computer system which is considered to be technical in nature. The invention works regardless of the applications being run and data being processed by the computer system. There is a technical effect at least in the sense of signposts (ii) and (v).

The invention defined in the claim is not excluded under section 1(2).

It is known for touch screen devices to enable text entry by displaying a virtual keyboard through which a user may enter text into an associated text box. Entering text in this way can be labour intensive and time consuming and is, given the relatively small size of many devices, prone to user error through mashing virtual keypresses.

The inventor has written a program that helps to alleviate the burden of text entry by using a trained recurrent neural network (RNN) to predict the next likely words (or string input) given the previous words or punctuations entered. The predicted words are ranked, and a selection of the most likely words is displayed to the user. The displayed word can then be selected by the user and used as the text entry from the virtual keyboard. The invention has the advantage of allowing the user to input desired text into the computer more accurately and by using fewer virtual key presses.
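The predict, rank and display steps can be sketched as follows. This is an illustration only: a bigram frequency table is used in place of the trained recurrent neural network, and the corpus and function names are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count which word follows which. A stand-in for the trained RNN."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for previous, following in zip(words, words[1:]):
            model[previous][following] += 1
    return model

def suggest(model, text, k=2):
    """Return the k most likely next words given the text entered so far."""
    words = text.lower().split()
    if not words:
        return []
    candidates = model.get(words[-1], Counter())
    # most_common ranks candidates by how often they followed the last word.
    return [word for word, _ in candidates.most_common(k)]

corpus = (
    ["see you soon"] * 3
    + ["see you later"] * 2
    + ["thank you very much"]
)
model = train_bigrams(corpus)
print(suggest(model, "See you"))
```

The ranked list returned by `suggest` corresponds to the words displayed above the virtual keyboard; selecting one of them replaces several individual key presses.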

A method of text entry on a device displaying an interactive virtual keyboard, the method comprising: receiving at the device input from the virtual keyboard; providing the input to a trained recurrent neural network to predict and rank a selection of words that are likely to be the user’s next input; displaying at least two of the most likely words to be next entered; and receiving an input corresponding to a selection of one of the displayed words.

The contribution relates to predictive text entry on a device having a virtual keyboard, where a recurrent neural network predicts and ranks words most likely to be entered next based on previously entered text and allowing a user of the device to select one of the predicted words for input in the device.

The contribution is more than a program for a computer as such. The contribution is a solution to a technical problem concerning the very operation of the device itself, i.e. that of improving the speed and accuracy of text entry using a keyboard. The invention does this by predicting words for the user to select for input such that text is entered using fewer key presses. The invention improves the virtual keyboard, making the device a more efficient and effective device for the user to use. This is a technical contribution. Signposts (iv) and (v) seem to apply.

The invention defined in the claim is not excluded under section 1(2).

Neural networks can be very large and complex, with large numbers of parameters and involving many calculations. Handling the large number of parameters and calculations requires correspondingly high amounts of memory and processor resources. It is desirable to reduce these requirements whilst retaining the benefit of a trained neural network.

The invention achieves this by providing an initial neural network created and trained using a conventional approach as a base model. This trained neural network is then optimised using a rationalisation process to produce a simpler, optimised network that produces approximately the same outputs as the initial neural network within a predetermined tolerance level.

The rationalisation process may involve the removal of selective elements of processing being undertaken, for example by pruning nodes from the network. Elements may be removed because they are redundant or have little effect on the overall results of the network.
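A minimal sketch of such a rationalisation loop is given below. It assumes, purely for illustration, that the base network is a single linear layer held as a list of weights and that rationalisation means zeroing the smallest-magnitude weights; neither assumption comes from the scenario, and the function names are hypothetical.

```python
def predict(weights, inputs):
    """Output of a single linear layer for each input row."""
    return [sum(w * x for w, x in zip(weights, row)) for row in inputs]

def prune(weights, keep):
    """Zero all but the `keep` largest-magnitude weights."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]

def rationalise(base, inputs, tolerance):
    """Return the most heavily pruned network whose outputs stay within tolerance."""
    reference = predict(base, inputs)
    for keep in range(1, len(base) + 1):
        candidate = prune(base, keep)
        outputs = predict(candidate, inputs)
        difference = max(abs(a - b) for a, b in zip(reference, outputs))
        if difference <= tolerance:
            return candidate
        # Difference exceeds the threshold: try a further, less aggressive pruning.
    return list(base)

base = [2.0, 0.01, -1.5, 0.02]
inputs = [[1, 1, 1, 1], [0.5, 2, 1, 0]]
print(rationalise(base, inputs, tolerance=0.05))
```

In this example the two small weights are removed because doing so changes each output by no more than the tolerance, whereas removing a third weight would not.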

A computer-implemented method of generating a trained, optimised neural network comprising the steps of: a) processing input data using a trained base neural network to generate first output data; b) generating an optimised neural network by rationalising the trained neural network; c) processing the input data using the optimised neural network to generate second output data; d) comparing the first and second output data to determine a difference; e) if the difference exceeds a predetermined threshold, generating a further optimised neural network by a rationalisation process; and f) repeating the steps c) to e) until the difference is below the threshold.

The contribution is a computer-implemented method of generating an optimised neural network by rationalising a trained base neural network, in which the output of the optimised network is compared to that of the base network and, if the output differs beyond a threshold, further optimised networks are iteratively generated until one is found whose output is within a threshold difference of the base network.

The contribution is no more than a program for a computer as such. The contribution is an iterative process of producing a simpler, optimised neural network starting from a base neural network. This is merely an iterative process for adapting one computer program (the program implementing the base network) to produce an optimised computer program (the program implementing the optimised neural network). The invention has not solved any technical problem with the computer itself. Any reduction in processing load or memory usage arises only as the result of the execution of a program with fewer instructions. This is a circumvention of the problems of processor load and memory usage addressed by the invention. There is no technical effect beyond the mere running of a better or optimised program (as was found to be the case in Gale [1991] RPC 305). There is no technical contribution.

The invention defined in the claim is excluded under section 1(2) as a program for a computer as such.

It is generally considered desirable to reduce the amount of processing used to perform any computational task. This is particularly the case for neural networks. Processing particularly large networks may require significant computing resources.

The inventor has recognised that in many applications, a neural network is used to process several very similar items of data. For example, when input data is in the form of a time series, there may be very little change in the data appearing in successive time windows of the time series. Examples may include successive frames of video data or stock price information sampled at hourly intervals. If the difference between two successive time windows is sufficiently small, then it is likely that classifying the data in each of those windows with the same neural network will give the same result. In such cases classifying the data in successive time windows using the neural network leads to redundant or unnecessary processing effort.

The inventive system creates an indicator for each time window, for example by applying a known hash function to the data contained in a time window. The system uses the indicator to check for differences between the time windows. If the indicator for a given window is different to that for a preceding window, then data of the given window is submitted to the neural network for classification. However, if the indicator for the given window is the same then the classification produced by the neural network for the preceding window is simply re-used.
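The indicator check can be sketched in a few lines. SHA-256 is chosen here only as an example of a known hash function, and the classifier is a hypothetical placeholder passed in by the caller rather than a real neural network.

```python
import hashlib

def classify_stream(windows, classify):
    """Classify each time window, re-using the previous result when the
    window's indicator matches that of the preceding window."""
    results = []
    last_indicator, last_output = None, None
    for window in windows:
        indicator = hashlib.sha256(window).hexdigest()
        if indicator != last_indicator:
            # Data changed: run the (expensive) network and remember its output.
            last_output = classify(window)
            last_indicator = indicator
        results.append(last_output)
    return results

calls = []

def placeholder_classifier(window):
    """Stand-in for the neural network; records each time it is executed."""
    calls.append(window)
    return len(window)

windows = [b"aa", b"aa", b"bbb", b"bbb", b"aa"]
outputs = classify_stream(windows, placeholder_classifier)
print(outputs, len(calls))
```

Five windows are processed but the placeholder network is executed only three times, because two of the windows are identical to their predecessors.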

A computer-implemented method of processing a stream of time indexed continuous data using a neural network, the method characterised by having the steps of: processing a first portion of input data having a first time-index value to generate a first data indicator; using the neural network to generate a first output from the first portion of input data; storing the first data indicator in association with the first output; processing a second portion of input data having a subsequent time-index value to generate a second data indicator; comparing the second data indicator with the first data indicator; wherein if the second data indicator is different to the first data indicator: using the neural network to generate a second output; storing the second data indicator in association with the second output; and if the second data indicator is the same as the first data indicator: retrieving the first output.

The contribution is identified as a method of processing a data stream, having first and second portions of data, with a neural network in which: a first result for the first portion of the data stream is generated by processing it with the neural network, indicators for the first and second portions of the data stream are generated, and if the indicators are different then a second result for the second portion is generated by processing it with the neural network, whereas if the indicators are the same then a second result for the second portion is generated by re-using the first result for the first portion.

The contribution falls entirely within the program for a computer as such exclusion. The method avoids unnecessary execution of a neural network. It is only necessary to execute the neural network to classify a given data window if that data window differs from its preceding window. Although this may reduce processing overhead, this benefit is only felt when the computer is concerned with executing the inventive program. It is not an improvement made to the computer irrespective of the application(s) being run. It cannot be regarded as solving a technical problem to do with the internal workings of the computer itself. In short, there is not a technical effect beyond the mere execution of a better program. There is no technical contribution.

The invention defined in the claim is excluded under section 1(2) as a program for a computer as such.

Active training of a neural network involves testing the neural network to find its areas of weakness. Training data examples in the areas of weakness are then used to train the neural network so that it will have better all-round performance.

In order to provide a baseline for any training exercise, it is essential that a specimen dataset is used. This is a dataset that is known and can be said to have reliable and consistent expected results.

The inventor has realised that for each element of the specimen data processed by the AI, a confidence level for accuracy can be arrived at. For example, if the specimen data contains pictures of animals, it may be that the confidence level for identifying dogs is higher than for cats. By comparing the confidence level to a threshold, the user can identify poor performing areas. Once identified, further training data relating to the poor performing areas can be used to bolster the accuracy of the AI. The specimen data set is only augmented to the extent necessary to address the poor performing areas. In the example given, the specimen data would be augmented with additional pictures of cats. This is more efficient than simply expanding the dataset across all its elements. The new training data can also be added to the existing training set to provide a full product for an end user to train their implementation.
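The selection of additional training data can be sketched as below. The sketch assumes that confidence levels have already been produced by the initially trained network and are grouped by label; the retraining step itself is omitted. The data, threshold and function names are hypothetical.

```python
def weak_labels(confidences, threshold):
    """Return labels whose average confidence falls below the threshold.

    `confidences` is a list of (label, confidence) pairs, one per specimen element.
    """
    by_label = {}
    for label, confidence in confidences:
        by_label.setdefault(label, []).append(confidence)
    return sorted(
        label for label, values in by_label.items()
        if sum(values) / len(values) < threshold
    )

def augment(training, pool, weak):
    """Add only those pooled examples that address the poorly performing labels."""
    return training + [example for example in pool if example[1] in weak]

confidences = [("dog", 0.95), ("dog", 0.90), ("cat", 0.60), ("cat", 0.70)]
weak = weak_labels(confidences, threshold=0.8)

training = [("img1", "dog")]
pool = [("img2", "cat"), ("img3", "dog"), ("img4", "cat")]
augmented = augment(training, pool, weak)
print(weak, augmented)
```

Only the additional pictures of cats are added, which mirrors the point that the data set is augmented only to the extent necessary.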

A computer-implemented method of training a neural network, the method comprising: training the neural network initially with a candidate set of training data; executing the initially trained neural network against a specimen input data set; for each element of the specimen data set determining a level of confidence in the accuracy of interpretation of the specimen data element by the initially trained neural network; and if the level of confidence for a given element is below a predetermined accuracy threshold then augmenting the training data with data related to the given element of the specimen data set and retraining the network using the augmented training data.

The contribution is a method of training a neural network involving determining by an initially trained neural network a level of confidence in accuracy of interpretation for each element of a specimen data set, and if the confidence level for a given element is below a threshold, then augmenting the training data with data related to the given element and retraining the network using the augmented training data.

The invention solves the problem of identifying what specific additional training data is needed to improve the accuracy of a neural network. This may result in a more efficient method of training; however, it does not produce a neural network that itself operates more effectively or efficiently. No technical problem has been solved within the neural network. The identification of the specific additional training data, though carried out by a computer program running a neural network, is not a technical problem. The solution of that problem therefore does not provide a technical effect.

The claimed invention is excluded as a program for a computer as such.

Many modern computing devices, such as smart phones, include heterogeneous computing resources such as a CPU host processor, a graphics processor (GPU) and a neural network accelerator (NNA). Each of these heterogeneous computing resources has a different capability for carrying out the processing needed to perform the function of a neural network. This means it may be necessary to subdivide the processing tasks of a layer of a neural network into portions and assign the portions to the heterogeneous computing resources according to their respective capability, to optimise the performance of the neural network on the device.

It is desirable that all computing resources finish processing their portions of the layer at the same time (or rendezvous) since this allows for efficient running of each of the computing resources, for example by avoiding delays or stalls and by reducing idle time of the computing resources.

To achieve this desired outcome, the inventor has realised that respective computing resources can be made to finish the respective processing of their portions at nearly the same time by altering their clock frequencies.
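The effect of altering clock frequencies can be illustrated with a simple calculation. The sketch assumes that processing time is inversely proportional to clock frequency and that each processor completes a fixed number of operations per cycle; these are simplifying assumptions, and the figures and names are hypothetical.

```python
def finish_time(work, ops_per_cycle, freq_hz):
    """Seconds needed to complete `work` operations at the given clock rate."""
    return work / (ops_per_cycle * freq_hz)

def align_clocks(processors):
    """Choose a clock frequency per processor so that all finish together.

    The slowest processor keeps its clock; the others are slowed to match it.
    """
    target = max(finish_time(**p) for p in processors.values())
    return {
        name: p["work"] / (p["ops_per_cycle"] * target)
        for name, p in processors.items()
    }

processors = {
    "cpu": {"work": 4e9, "ops_per_cycle": 4, "freq_hz": 1.0e9},   # finishes in 1.00 s
    "gpu": {"work": 8e9, "ops_per_cycle": 16, "freq_hz": 1.0e9},  # finishes in 0.50 s
    "nna": {"work": 12e9, "ops_per_cycle": 32, "freq_hz": 0.5e9}, # finishes in 0.75 s
}
new_clocks = align_clocks(processors)
print(new_clocks)
```

Here the GPU and NNA clocks are reduced so that both finish their portions at the same time as the CPU. Whether clocks are lowered, raised or both is a design choice that the scenario leaves open.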

A method of operating a neural network on a processing system comprising a plurality of processors, each processor having a differing neural network computing capability, the method comprising: determining a distribution of the processing of a layer of the neural network so that each of the plurality of processors is assigned a portion of the layer’s processing according to their respective neural network computing capability; determining the time that each processor will take to perform their portion of the layer’s processing; determining if a clock frequency of any of the processors should be altered to change the time the processor will take to complete its portion; distributing the portions to the respective processors; and in response to determining that a clock frequency for a processor should be modified, modifying the clock frequency for that processor when it processes its respective portion.

The contribution is a method of operating a neural network using heterogeneous computing resources where the processing load for a layer of the neural network is shared out amongst the processing resources, and a clock frequency of at least one processing resource is adjusted to alter the time at which its portion of processing finishes.

The contribution is more than a program for a computer as such and is technical in nature. This is because it includes a process of operating a computer in a new way in a relevant technical sense, for example by controlling the clock frequency of one processor so that each heterogeneous processor finishes execution of its portion of a neural network layer at the same time. Signpost (iii) indicates patentability.

The invention defined in the claim is not excluded under section 1(2).

Machine learning models, such as neural networks, may require complex calculations to be performed by a processing unit (such as a hardware accelerator).

For example, a neural network may include one or more convolutional neural network layers for performing convolution calculations using input data. The processing of these layers typically involves numerous matrix multiplications involving very large matrices of input data. These sorts of calculations are computationally expensive to perform using existing processing units.

In addition, the nature of certain machine learning algorithms means that a large fraction of the input data for a given layer of a neural network have values that are zero. This means existing processing units perform a large number of unnecessary calculations that include multiplying one number (e.g. a convolution kernel value) by a zero value.

The inventor has devised a processing unit that can skip or bypass a computation upon seeing zero input values, having the advantage of making the processing unit more computationally efficient compared to known processing units.

A set of data values to be processed by a given neural network layer is received and stored in a memory of the processing unit. The processing unit has a control unit that checks the input data for zero and non-zero values. The control unit generates an address index that identifies only memory addresses of the memory storing non-zero input data values.

The control unit uses the address index to select memory addresses that store non-zero input data values and provides the non-zero input data values onto a data bus so that they can be processed by an array of processing elements.
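The address index can be modelled in software as follows. This is only an analogy: in the scenario the index is generated and used by a hardware control unit, whereas here a Python list stands in for the memory and the function names are hypothetical.

```python
def build_address_index(memory):
    """Addresses of the memory locations that hold non-zero input values."""
    return [address for address, value in enumerate(memory) if value != 0]

def sparse_dot(memory, kernel):
    """Multiply-accumulate over the indexed addresses only.

    Returns the result and the number of multiplications actually performed.
    """
    index = build_address_index(memory)
    total = sum(memory[address] * kernel[address] for address in index)
    return total, len(index)

memory = [0, 3, 0, 0, 2, 0]   # input data values, mostly zero
kernel = [5, 1, 7, 9, 4, 6]   # e.g. convolution kernel values
result, multiplications = sparse_dot(memory, kernel)
print(build_address_index(memory), result, multiplications)
```

Six input values are stored but only two multiplications are carried out, since the four zero-valued inputs are never presented to the processing elements.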

A computer-implemented method for performing calculations for a neural network having a plurality of layers, the method performed by a processing unit having a memory, a data bus, a control unit, and an array of processing elements, the method comprising receiving, by the processing unit, a plurality of input data values to be processed by a layer of the plurality of layers; determining, by the control unit, whether each one of the input data values has a zero value or non-zero value; storing the plurality of input data values in the memory; generating, by the control unit, an address index identifying only memory address locations in the memory that store non-zero input data values; and providing, by the control unit, based on the memory address locations identified by the address index, the non-zero input data values from the memory over the data bus to the array of processing elements.

The contribution relates to performing machine learning calculations in a processing unit which has a control unit that determines whether received input data values to be processed by a neural network layer have zero or non-zero values, generates an address index identifying memory addresses at which non-zero input data values are stored, and that uses the address index to provide non-zero input data values to an array of processing elements.

Although the contribution is limited to the performance of machine learning calculations required by a given layer of a neural network, the contribution involves the generation of an address index that is effectively used to control the array of processing elements such that they only process non-zero input data values stored in the memory. This is an example of operating a computer in a new way in a technical sense according to signpost (iii). The contribution is a technical solution to the problem of improving the computational efficiency of existing processing units according to signpost (v). The invention is more than a program for a computer as such.

The invention is not excluded under section 1(2).

It is well known to use distributed computing for machine learning tasks. For example, when iteratively training a neural network, a set of training data may be subdivided and shared out among the processing nodes of a distributed computer system. Each node may then process their training data to produce a respective partial training result, e.g. the amount by which one or more weights in a neural network should be adjusted. These partial results are then reduced (i.e. processed with a computing function) to produce a full result which is broadcast back to the nodes to update the machine learning model on each node before the next step of training occurs.
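The reduce-and-broadcast step described above can be sketched as a simple element-wise sum. This illustrates only the general idea of combining partial results; it does not model any particular topology or exchange method, and the function name and values are hypothetical.

```python
def all_reduce_sum(partials):
    """Combine per-node partial results element by element and give every
    node its own copy of the full result."""
    reduced = [sum(column) for column in zip(*partials)]
    return [list(reduced) for _ in partials]

# Hypothetical weight adjustments computed by three nodes on their data shards.
partials = [
    [1, 2, 3],
    [10, 20, 30],
    [100, 200, 300],
]
full = all_reduce_sum(partials)
print(full)
```

In a real distributed system the cost lies in how the partial results travel between nodes, which is what topologies and exchange methods aim to optimise.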

Several existing topologies (e.g. ring, torus) and data exchange methods (e.g. all-reduce and all-gather methods) have been devised to optimise the efficiency of distributed computing systems when processing workloads such as those found in machine learning. The inventor has devised a new topology and exchange method which can process machine learning tasks with improved efficiency.

The topology is shown in the figure above. Processing nodes (e.g. P0A-P3B) are arranged in groups. Each node of a group is connected to each other node in the group by a first and a second communication link. The groups are interconnected in rings so that each node is a member of a single group and a single ring.

The data exchange method used by the topology works iteratively. Each node of a group exchanges two data elements from an array of data elements with the other node(s) in their group, via the respective first and second links. Then, each processing node reduces each received data element with a data element found in a corresponding position in its stored data array through a process of sequentially sharing and combining the data elements.

A method of operating a computer comprising processor nodes arranged in groups and rings, such that all nodes in a single group are connected to each other node in the group by a first and second link, the groups being interconnected in rings so that each node is in a single group and a single ring, the method comprising: operating a machine learning collective where each processor node processes input data to generate an output data array of elements; exchanging data elements using exchange steps of the machine learning collective; and wherein in each exchange step the processing nodes of all groups exchange, via the respective first and second links, two data elements with the other nodes in its group, and wherein all processing nodes reduce each received data element with the data element in the corresponding position in the array on that processing node.

The contribution is a machine learning method using a new topology and data exchange method that optimises the performance of the machine learning task on a distributed computer system through the interconnection of the nodes and the manner of data exchange.

The contribution is not solely a program as such because, on top of being a machine learning method, it is concerned with a new computer topology and data exchange method. This is an example of a new arrangement of hardware which operates a computer system in a new way, in a relevant technical sense. A technical contribution is revealed following signpost (iii). The contribution is a technical solution to the problem addressed by the invention, namely how to arrange a distributed computer system to efficiently perform collective machine learning tasks, although it is embodied as a program. Signpost (v) also points to patentability.

The invention is not excluded under section 1(2).
