Meet Azeez Saheed, the Unilag Undergrad Who Created Naijaweb, a Dataset of 230 Million GPT2 Tokens From Nairaland

If you ask Saheed Azeez how difficult it was to create Naijaweb, a dataset of 230 million GPT2 tokens built from Nairaland, he'll tell you it was easy. All you need to know is web scraping and data cleaning, he said. However, when he explained how he created Naijaweb, most of what he said flew