Microsoft is training small language models to outperform ChatGPT
Microsoft may have a $10 billion investment and partnership with OpenAI for its ChatGPT large language model, but it looks like the company is hedging its bets with its own smaller transformer technology.
Microsoft researchers recently revealed that their Phi-1 1.3B transformer-based language model beat much larger models on the HumanEval and MBPP coding benchmarks, even outperforming partner OpenAI's ChatGPT when tasked with coding.
Phi-1 1.3B was trained on roughly 7 billion tokens of "textbook quality" data: about 6 billion high-quality tokens selected from The Stack and StackOverflow using a GPT-4-based classifier, plus roughly 1 billion synthetic tokens generated with GPT-3.5. Training ran on eight NVIDIA A100 GPUs over four days.
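The general technique behind that data pipeline is classifier-based filtering: score each candidate document for educational value and keep only high scorers. The sketch below illustrates the idea with a toy heuristic scorer; it is a hypothetical stand-in, not the GPT-4-based classifier the researchers actually used.

```python
def quality_score(doc: str) -> float:
    """Toy quality heuristic (stand-in for a real classifier):
    reward documents with comments/docstrings and some length."""
    score = 0.0
    if '"""' in doc or "#" in doc:   # has explanatory text
        score += 0.5
    if len(doc.split()) >= 10:       # not a trivial fragment
        score += 0.5
    return score

def filter_corpus(docs: list[str], threshold: float = 0.75) -> list[str]:
    """Keep only documents whose quality score meets the threshold."""
    return [d for d in docs if quality_score(d) >= threshold]

corpus = [
    'def add(a, b):\n    """Return the sum of the two arguments a and b."""\n    return a + b',
    "x=1",
    "# Compute the factorial of n iteratively, explained step by step for learners.",
]
kept = filter_corpus(corpus)  # the bare "x=1" snippet is filtered out
```

In the real pipeline the scorer is a trained model and the corpus is billions of tokens, but the keep-or-drop structure is the same.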