WO2022162470 - INTERLOCKING BACKPROPAGATION FOR AUTOMATED TRAINING OF COMPUTER PREDICTIVE MODELS

National phase entry is expected.
Publication Number WO/2022/162470
Publication Date 04.08.2022
International Application No. PCT/IB2022/000045
International Filing Date 28.01.2022
Title
[English] INTERLOCKING BACKPROPAGATION FOR AUTOMATED TRAINING OF COMPUTER PREDICTIVE MODELS
Applicants
COHERE INC. 49 Spadina St. Unit 400 Toronto, Ontario M5V 0G9, CA
Inventors
GOMEZ, Aidan c/o Cohere Inc. 49 Spadina St. Unit 400 Toronto, Ontario M5V 0G9, CA
FROSST, Nicholas c/o Cohere Inc. 49 Spadina St. Unit 400 Toronto, Ontario M5V 0G9, CA
GOU, Zhen c/o Cohere Inc. 49 Spadina St. Unit 400 Toronto, Ontario M5V 0G9, CA
Priority Data
63/142,898   28.01.2021   US
17/585,380   26.01.2022   US

The term for entry into the National Phase has expired.

Abstract
[English] A method for training a transformer model that strikes a middle ground between local and global learning by using interlocking backpropagation. Instead of training with one single global objective, or training with each accelerator having its own local objective, the method trains a large-scale network with auxiliary classification layers. The auxiliary classification layers use local losses to optimize a subset of the network. The local losses may be computed based on a group of processing units. Different groups of processing units may contain overlapping processing units such that there is indirect communication flow throughout the network.
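The grouping scheme the abstract describes can be illustrated with a short sketch. This is not the patented implementation; the function name, parameters, and the specific overlap scheme are assumptions chosen for illustration. It shows how a stack of layers might be partitioned into overlapping groups, each optimized by a local loss from its own auxiliary head, with the overlap providing the indirect communication flow between neighbouring groups.

```python
# Illustrative sketch only (assumed scheme, not the patented method):
# partition a stack of `num_layers` layers into groups of `group_size`,
# each sharing `overlap` layers with the previous group. An auxiliary
# classification head attached to each group's last layer would compute
# a local loss that backpropagates only through that group's layers;
# shared layers receive gradients from two local losses.

def interlocking_groups(num_layers, group_size, overlap):
    """Return lists of layer indices, consecutive groups overlapping
    by `overlap` layers (hypothetical helper for illustration)."""
    step = group_size - overlap
    groups = []
    start = 0
    while start + group_size <= num_layers:
        groups.append(list(range(start, start + group_size)))
        start += step
    # cover any trailing layers with one final full-size group
    if groups and groups[-1][-1] != num_layers - 1:
        groups.append(list(range(num_layers - group_size, num_layers)))
    return groups

groups = interlocking_groups(num_layers=8, group_size=4, overlap=2)
# -> [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7]]
# Layers 2-3 belong to both the first and second groups, so gradients
# from both local losses reach them, linking the groups indirectly.
```

With `overlap=0` this degenerates to fully local per-group training; with one group spanning all layers it degenerates to ordinary global backpropagation, which is the middle ground the abstract refers to.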