Releases · hiyouga/LlamaFactory

v0.3.3: ModelScope Integration, Reward Server

hiyouga released this 03 Dec 14:17

v0.3.3

5fe3cce

New features

Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
Support launching a reward model server in the demo API by specifying --stage=rm in api_demo.py
Support using a reward model server in PPO training by specifying --reward_model_type api
Support adjusting the shard size of exported models via the export_size argument
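
To make the idea of a shard-size limit concrete, the sketch below groups weight tensors greedily into shards that stay under a byte budget. It is an illustration only, not LLaMA Factory's export code: the function name plan_shards, the tensor names, and the sizes are all hypothetical, and a real exporter may split weights differently.

```python
def plan_shards(tensor_sizes, max_shard_bytes):
    """Greedily group tensors (name -> size in bytes) into ordered shards.

    A tensor larger than the budget still gets a shard of its own,
    because a single tensor cannot be split in this sketch.
    """
    shards = []
    current, current_size = [], 0
    for name, size in tensor_sizes.items():
        # Start a new shard when adding this tensor would exceed the budget.
        if current and current_size + size > max_shard_bytes:
            shards.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        shards.append(current)
    return shards


# Hypothetical tensor names and sizes, for illustration only.
example_sizes = {"embed": 4, "layer0": 4, "layer1": 4, "lm_head": 10}
example_plan = plan_shards(example_sizes, max_shard_bytes=8)
```

A smaller budget yields more, smaller files, which can help when individual files must fit a storage or upload limit.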

New models

Base models
- DeepseekLLM-Base (7B/67B)
- Qwen (1.8B/72B)
Instruct/Chat models
- DeepseekLLM-Chat (7B/67B)
- Qwen-Chat (1.8B/72B)
- Yi-34B-Chat

New datasets

Supervised fine-tuning datasets
- Nectar dataset by @mlinmg in #1689
Preference datasets
- Nectar dataset by @mlinmg in #1689

Bug fix

Improve get_current_device by @billvsme in #1690
Improve web UI preview by @Samge0 in #1695
Fix #1543 #1597 #1657 #1658 #1659 #1668 #1682 #1696 #1699 #1703 #1707 #1710

Contributors

billvsme, Samge0, and 2 other contributors

v0.3.2: Patch release

hiyouga released this 21 Nov 05:41

v0.3.2

7f54008

New features

Support training GPTQ quantized models #729 #1481 #1545
Support resuming reward model training #1567

Bug fix

Change default PPO parameters by @hannlp in #1553
Fix ChatGLM2&3 templates #1453 #1480
Fix #1548 by @Outsider565 in #1544
Fix #1263 #1550 #1558

Contributors

hannlp and Outsider565

v0.3.0: Full-Parameter RLHF

hiyouga released this 16 Nov 08:24

v0.3.0

95d0f77

New features

Support full-parameter RLHF training (RM & PPO)
Refactor llmtuner core by @hiyouga in #1525
Better LLaMA Board: full-parameter RLHF and demo mode

New models

Base models
- ChineseLLaMA-1.3B
- LingoWhale-8B
Instruct/Chat models
- ChineseAlpaca-1.3B
- Zephyr-7B-Alpha/Beta

Bug fix

Fix bugs in partial-parameter (freeze) tuning
Fix #224 #336 #931 #936 #1011 #1489 #1494 #1507 #1514

Contributors

hiyouga

v0.2.2: Patch release

hiyouga released this 13 Nov 15:16

v0.2.2

ec334f5

Bug fix

Fix the OOM issue in PPO training by @mmbwf in #424
Fix fine-tuning arguments by @yyq in #1454
Refactor constants and evaluation by @hiyouga
Fix #1452 #1466 #1478

Contributors

yyq, hiyouga, and mmbwf

v0.2.1: Variant Models, NEFTune Trick

hiyouga released this 09 Nov 08:30

v0.2.1

a9cbca1

New features

Support NEFTune trick for supervised fine-tuning by @anvie in #1252
Support loading datasets in the sharegpt format (see data/readme for details)
Support generating multiple responses in demo API via the n parameter
Support caching the pre-processed dataset files via the cache_path argument
Better LLaMA Board (pagination, controls, etc.)
Support push_to_hub argument #1088
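
For readers unfamiliar with NEFTune, the sketch below shows the core idea as commonly described: during training, uniform noise scaled by alpha / sqrt(seq_len * hidden_dim) is added to the token embeddings. This is a plain-Python illustration under stated assumptions, not the code merged in #1252; the function name neftune_noise and the default alpha of 5.0 are chosen here for illustration.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of a (seq_len x hidden_dim) embedding matrix.

    Each entry receives uniform noise in [-1, 1] scaled by
    alpha / sqrt(seq_len * hidden_dim). Intended for training only.
    """
    if rng is None:
        rng = random.Random()
    seq_len = len(embeddings)
    hidden_dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * hidden_dim)
    return [
        [value + rng.uniform(-1.0, 1.0) * scale for value in row]
        for row in embeddings
    ]


# Toy 4 x 4 embedding matrix of zeros, so the output is the noise itself.
toy_embeddings = [[0.0] * 4 for _ in range(4)]
noisy = neftune_noise(toy_embeddings, alpha=5.0, rng=random.Random(0))
```

Because the scale shrinks with sequence length and hidden size, longer sequences receive proportionally smaller per-entry noise.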

New models

Base models
- ChatGLM3-6B-Base
- Yi (6B/34B)
- Mistral-7B
- BlueLM-7B-Base
- Skywork-13B-Base
- XVERSE-65B
- Falcon-180B
- Deepseek-Coder-Base (1.3B/6.7B/33B)
Instruct/Chat models
- ChatGLM3-6B
- Mistral-7B-Instruct
- BlueLM-7B-Chat
- Zephyr-7B
- OpenChat-3.5
- Yayi (7B/13B)
- Deepseek-Coder-Instruct (1.3B/6.7B/33B)

New datasets

Pre-training datasets
- RedPajama V2
- Pile
Supervised fine-tuning datasets
- OpenPlatypus
- ShareGPT Hyperfiltered
- ShareGPT4
- UltraChat 200k
- AgentInstruct
- LMSYS Chat 1M
- Evol Instruct V2

Bug fix

Fix full-parameter DPO training #1383 #1422 (inspired by @mengban)
Fix tokenizer config by @lvzii in #1436
Fix #1197 #1215 #1217 #1218 #1228 #1232 #1285 #1287 #1290 #1316 #1325 #1349 #1356 #1365 #1411 #1418 #1438 #1439 #1446

Contributors

anvie, mengban, and lvzii

v0.2.0: Web UI Refactor, LongLoRA

hiyouga released this 15 Oct 13:06

v0.2.0

d627ab4

New features

Support LongLoRA for the LLaMA models
Support training the Qwen-14B and InternLM-20B models
Support training state recovery for the all-in-one Web UI
Support Ascend NPU by @statelesshz in #975
Integrate MMLU, C-Eval and CMMLU benchmarks

Modifications

Rename repository to LLaMA Factory (former LLaMA Efficient Tuning)
Use the cutoff_len argument instead of max_source_length and max_target_length #944
Add a train_on_prompt option #1184
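
The effect of an option like train_on_prompt can be sketched as label masking: when it is off, prompt tokens are excluded from the loss, and when it is on, they are included. The code below is a minimal illustration under assumptions, not LLaMA Factory's preprocessing code; build_labels is a hypothetical helper, and -100 is used because it is the default ignore_index of PyTorch's CrossEntropyLoss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def build_labels(prompt_ids, response_ids, train_on_prompt=False):
    """Concatenate prompt and response token ids and build loss labels.

    With train_on_prompt=False, prompt positions are masked with
    IGNORE_INDEX so that only response tokens contribute to the loss.
    """
    input_ids = list(prompt_ids) + list(response_ids)
    if train_on_prompt:
        labels = list(input_ids)
    else:
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


# Hypothetical token ids, for illustration only.
masked_inputs, masked_labels = build_labels([11, 12, 13], [21, 22])
full_inputs, full_labels = build_labels([11, 12, 13], [21, 22], train_on_prompt=True)
```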

Bug fix

Fix numeric error caused by the layer norm dtype in 84b7486 [1]
Fix bugs in PPO Trainer by @mmbwf in #900
Fix #424 #762 #814 #887 #913 #1000 #1026 #1032 #1064 #1068 #1074 #1086 #1097 #1176 #1177 #1190 #1191

[1] huggingface/transformers#25598 (comment)

Contributors

ji-huazhong and mmbwf

v0.1.8: FlashAttention-2 and Baichuan2

hiyouga released this 11 Sep 09:55

v0.1.8

f3e638a

New features

Support FlashAttention-2 for LLaMA models (an RTX 4090, A100, A800, or H100 GPU is required)
Support training the Baichuan2 models
Use right-padding to avoid overflow in fp16 training
Align the computation method of the reward score with DeepSpeed-Chat (better generation)
Support --lora_target all argument which automatically finds the applicable modules for LoRA training
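
As a reminder of what right-padding means in the list above, the sketch below pads a batch of token sequences to a common length on either side and builds the matching attention masks. It is a generic illustration, not the project's data collator; pad_batch and the pad id of 0 are assumptions made for the example.

```python
def pad_batch(sequences, pad_id, side="right"):
    """Pad token-id sequences to equal length; return (padded, attention_masks)."""
    if side not in ("right", "left"):
        raise ValueError("side must be 'right' or 'left'")
    max_len = max(len(seq) for seq in sequences)
    padded, masks = [], []
    for seq in sequences:
        pad = [pad_id] * (max_len - len(seq))
        real = [1] * len(seq)   # 1 marks real tokens
        fake = [0] * len(pad)   # 0 marks padding
        if side == "right":
            padded.append(list(seq) + pad)
            masks.append(real + fake)
        else:
            padded.append(pad + list(seq))
            masks.append(fake + real)
    return padded, masks


# Hypothetical token ids, for illustration only.
right_ids, right_mask = pad_batch([[5, 6, 7], [8]], pad_id=0, side="right")
left_ids, left_mask = pad_batch([[5, 6, 7], [8]], pad_id=0, side="left")
```

With right-padding, real tokens always start at position 0, so padding only ever follows the content.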

Bug fix

Use efficient EOS tokens to align with the Baichuan training (baichuan-inc/Baichuan2#23)
Remove PeftTrainer to save model checkpoints in DeepSpeed training
Fix bugs in web UI by @beat4ocean in #596, by @codemayq in #644 #651 #678 #741, and by @kinghuin in #786
Add dataset explanation by @panpan0000 in #629
Fix a bug in the DPO data collator
Fix a bug in the ChatGLM2 tokenizer with right-padding
Fix #608 #617 #649 #757 #761 #763 #809 #818

Contributors

codemayq, kinghuin, and 2 other contributors

v0.1.7: Script Preview and RoPE Scaling

hiyouga released this 18 Aug 09:39

v0.1.7

6eed1db

New features

Preview training script in Web UI by @codemayq in #479 #511
Support resuming from checkpoints by @niuba in #434 (transformers>=4.31.0 required)
Two RoPE scaling methods: linear and NTK-aware scaling for LLaMA models (transformers>=4.31.0 required)
Support training the ChatGLM2-6B model
Support PPO training in bfloat16 data type #551
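
The two RoPE scaling methods listed above differ in what they rescale. As a rough sketch of the commonly cited formulations (not this project's or transformers' code; the function names and the default base of 10000 are assumptions for illustration), linear scaling divides the position index by the factor, while NTK-aware scaling keeps positions unchanged and enlarges the rotary base instead:

```python
import math


def rope_inv_freq(dim, base=10000.0):
    """Inverse frequencies for rotary embeddings over an even head dimension."""
    return [base ** (-(2 * i) / dim) for i in range(dim // 2)]


def linear_scaled_angles(position, dim, factor, base=10000.0):
    """Linear scaling: compress the position index by `factor`."""
    return [(position / factor) * freq for freq in rope_inv_freq(dim, base)]


def ntk_scaled_angles(position, dim, factor, base=10000.0):
    """NTK-aware scaling: keep the position, enlarge the base."""
    scaled_base = base * factor ** (dim / (dim - 2))
    return [position * freq for freq in rope_inv_freq(dim, scaled_base)]


# Rotation angles at position 100 for a toy head dimension of 8 and factor 2.
linear_angles = linear_scaled_angles(100, dim=8, factor=2.0)
ntk_angles = ntk_scaled_angles(100, dim=8, factor=2.0)
```

In this formulation the highest-frequency component is left untouched by NTK-aware scaling, while the lowest-frequency component is stretched by the full factor, matching linear scaling at that end of the spectrum.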

Bug fix

Unusual output of quantized models #278 #391
Runtime error in distributed DPO training #480
Unexpected truncation in generation #532
Dataset streaming error in pre-training #548 #549
Tensor shape mismatch in PPO training using ChatGLM2 #527 #528
#475 #476 #478 #481 #494 #551

Contributors

niuba and codemayq

v0.1.6: DPO Training and Qwen-7B

hiyouga released this 11 Aug 15:43

v0.1.6

d5f1b99

Adapt DPO training from the TRL library
Support fine-tuning the Qwen-7B, Qwen-7B-Chat, XVERSE-13B, and ChatGLM2-6B models
Implement the "safe" ChatML template for Qwen-7B-Chat
Better Web UI
Prettier README by @codemayq in #382
New features: #395 #451
Fix InternLM-7B inference #312
Fix bugs: #351 #354 #361 #376 #408 #417 #420 #423 #426

Contributors

codemayq

v0.1.5: Patch release

hiyouga released this 02 Aug 08:13

v0.1.5

ba61894

Fix LLaMA-2 template #307
Fix bug in preprocessing 968ce0d
Fix #294 #296
