
Large-Scale Pretraining with Transformers

More details are covered in the NLP Pretraining section; a minimal sketch of the typical pretraining objective is given below.
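Since this page mostly points elsewhere, here is a minimal illustrative sketch, not taken from the linked chapter, of a single BERT-style masked-language-modeling (MLM) step, the kind of objective large-scale transformer pretraining commonly optimizes. All sizes, the mask token id, and the 15% masking rate are assumptions chosen for illustration.

```python
# Minimal MLM sketch with a tiny Transformer encoder (illustrative assumptions only).
import torch
import torch.nn as nn

vocab_size, d_model, mask_id = 1000, 64, 3  # hypothetical vocabulary size and mask token id

embed = nn.Embedding(vocab_size, d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)
lm_head = nn.Linear(d_model, vocab_size)  # predicts the original token ids

tokens = torch.randint(0, vocab_size, (8, 16))  # a batch of 8 sequences of 16 token ids
labels = tokens.clone()
mask = torch.rand(tokens.shape) < 0.15          # randomly mask ~15% of positions
inputs = tokens.masked_fill(mask, mask_id)      # replace masked positions with the mask token
labels[~mask] = -100                            # compute loss only at masked positions

logits = lm_head(encoder(embed(inputs)))
loss = nn.functional.cross_entropy(
    logits.view(-1, vocab_size), labels.view(-1), ignore_index=-100
)
loss.backward()  # one pretraining step; an optimizer update would follow
```

At scale, the same objective is run over billions of tokens with much larger models; the linked d2l.ai chapter compares how encoder-only (BERT), encoder-decoder (T5), and decoder-only (GPT) models adapt this recipe.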

Reference

  • https://d2l.ai/chapter_attention-mechanisms-and-transformers/large-pretraining-transformers.html
