Index of /大语言模型/
../
Attention优化/ 16-Jul-2024 12:12 -
DeepSeek/ 07-Feb-2025 04:49 -
LoRA/ 17-Sep-2024 15:18 -
Metrics/ 17-Sep-2024 15:24 -
MoE/ 06-Sep-2024 05:46 -
PEFT/ 17-Sep-2024 15:16 -
PPO/ 17-Sep-2024 15:31 -
RLHF/ 17-Sep-2024 15:36 -
Reasoning/ 17-Sep-2024 15:39 -
qwen/ 05-Feb-2025 04:11 -
speech2speech/ 03-Jan-2025 06:01 -
tokenizer/ 10-Aug-2024 02:04 -
位置编码/ 04-Jul-2024 11:58 -
分布式推理/ 15-Jul-2024 05:23 -
分词器/ 21-Aug-2024 11:46 -
强化学习/ 05-Feb-2025 01:59 -
推理工程/ 02-Jan-2025 03:37 -
音频/ 14-Aug-2024 10:35 -
2309.05463.pdf 17-Sep-2023 09:42 387559
2309.16583.pdf 04-Nov-2023 09:22 597201
A Survey on Efficient Inference for Large Langu..> 11-Jun-2024 01:03 1280725
AIOS:LLM Agent Operating System.pdf 27-Mar-2024 01:04 569103
BitNet:Scaling 1-bit Transformer for Large Lang..> 18-Oct-2023 00:56 586469
Blockwise parallel decoding for deep autoregres..> 08-Apr-2024 01:42 382092
Byte Pair Encoding 10-Aug-2024 02:06 0
CLLMs:Consistency Large Language Models(解码生成).pdf 12-May-2024 09:48 1308041
EFFICIENTLY SCALING TRANSFORMER INFERENCE.pdf 13-Jan-2024 09:30 1036959
Efficient LLM Inference with Kcache.pdf 30-Apr-2024 01:06 327793
Fast Transformer Decoding: One Write-Head is Al..> 07-Nov-2019 01:56 142627
Improving Large Language Model Throughput with ..> 16-Dec-2023 13:39 789794
Keep the Cost Down:A Review on Methods to optim..> 14-Aug-2024 01:06 917652
LLM in a flash.pdf 13-Jan-2024 07:39 1092689
MobileLLM(Star Paper).pdf 26-Feb-2024 02:27 1204699
Online normalizer calculation for softmax.pdf 22-Jan-2023 21:19 147347
Scalable MatMul-free Language Modeling.pdf 19-Jun-2024 01:41 1257518
The Era of 1-bit LLMs.pdf 28-Feb-2024 02:54 463748
Towards Efficient Generative Large Language Mod..> 13-Jan-2024 07:42 1397102
TriForce:Lossless Accelerate of Long Sequence G..> 24-Apr-2024 01:04 1092813
W1 - W1.pdf 17-Sep-2024 16:00 3899392
W1.pdf 01-Feb-2024 01:10 2863304
s1:Simple test-time scaling.pdf 04-Feb-2025 03:00 1461512
w2.pdf 01-Feb-2024 01:10 3332109
w3.pdf 23-Jun-2023 13:56 2760622