Discussion about this post

JP

The MoE architecture is clever and the 1M token context window is properly useful. There's a big difference, though, between running these weights locally and using Alibaba Cloud's hosted API. Data practices vary wildly across Chinese providers. I did a provider-by-provider breakdown here: https://reading.sh/which-ai-providers-wont-train-on-your-data-e38280ff9887
