path: root/training/strategy/ti.py
Commit message | Author | Age | Files | Lines
Fix | Volpeon | 2023-04-15 | 1 | -19/+0
TI via LoRA | Volpeon | 2023-04-15 | 1 | -5/+4
Update | Volpeon | 2023-04-13 | 1 | -2/+1
Update | Volpeon | 2023-04-10 | 1 | -1/+1
Fix sample gen: models sometimes weren't in eval mode | Volpeon | 2023-04-10 | 1 | -13/+12
Update | Volpeon | 2023-04-10 | 1 | -1/+1
Update | Volpeon | 2023-04-09 | 1 | -1/+1
Update | Volpeon | 2023-04-08 | 1 | -6/+6
Fix TI | Volpeon | 2023-04-08 | 1 | -1/+2
Update | Volpeon | 2023-04-08 | 1 | -1/+1
Fix | Volpeon | 2023-04-07 | 1 | -4/+2
Fix | Volpeon | 2023-04-07 | 1 | -3/+4
TI: Bring back old embedding decay | Volpeon | 2023-04-04 | 1 | -1/+21
Improved sparse embeddings | Volpeon | 2023-04-03 | 1 | -4/+4
TI: Delta learning | Volpeon | 2023-04-03 | 1 | -23/+0
Lora: Only register params with grad to optimizer | Volpeon | 2023-04-02 | 1 | -1/+0
Revert | Volpeon | 2023-04-01 | 1 | -19/+81
Fix | Volpeon | 2023-04-01 | 1 | -1/+3
Combined TI with embedding and LoRA | Volpeon | 2023-04-01 | 1 | -58/+18
Experimental: TI via LoRA | Volpeon | 2023-04-01 | 1 | -26/+4
Fix TI | Volpeon | 2023-03-27 | 1 | -8/+10
Sparse TI embeddings without sparse tensors | Volpeon | 2023-03-27 | 1 | -10/+8
Improved TI embeddings | Volpeon | 2023-03-26 | 1 | -2/+1
Refactoring, fixed Lora training | Volpeon | 2023-03-24 | 1 | -10/+12
Update | Volpeon | 2023-03-23 | 1 | -2/+2
Update | Volpeon | 2023-03-01 | 1 | -1/+1
Fixed TI normalization order | Volpeon | 2023-02-21 | 1 | -5/+10
Embedding normalization: Ignore tensors with grad = 0 | Volpeon | 2023-02-21 | 1 | -4/+11
Remove xformers, switch to Pytorch Nightly | Volpeon | 2023-02-17 | 1 | -2/+2
Update | Volpeon | 2023-02-13 | 1 | -1/+1
Fix Lora memory usage | Volpeon | 2023-02-07 | 1 | -2/+0
Add Lora | Volpeon | 2023-02-07 | 1 | -9/+29
Restored LR finder | Volpeon | 2023-01-20 | 1 | -3/+2
Move Accelerator preparation into strategy | Volpeon | 2023-01-19 | 1 | -1/+21
Update | Volpeon | 2023-01-17 | 1 | -6/+8
Fix | Volpeon | 2023-01-17 | 1 | -4/+5
Fix | Volpeon | 2023-01-17 | 1 | -1/+0
Make embedding decay work like Adam decay | Volpeon | 2023-01-17 | 1 | -9/+5
Update | Volpeon | 2023-01-17 | 1 | -4/+4
Update | Volpeon | 2023-01-17 | 1 | -6/+13
Training update | Volpeon | 2023-01-16 | 1 | -6/+7
More training adjustments | Volpeon | 2023-01-16 | 1 | -1/+1
Handle empty validation dataset | Volpeon | 2023-01-16 | 1 | -1/+1
Update | Volpeon | 2023-01-15 | 1 | -28/+26
Removed unused code, put training callbacks in dataclass | Volpeon | 2023-01-15 | 1 | -10/+10
Added functional TI strategy | Volpeon | 2023-01-15 | 1 | -0/+164